(54) Method and apparatus for encoding bitstreams for seamless reproduction



(57) An optical disk having a data structure such that moving
image data and audio data are reproduced naturally under one
title, without stoppage (freeze) of the video display at the
connections between system streams (VOB) in which the data are
interleaved, when the data are reproduced by connecting the
system streams (VOB) to each other. At least the first audio
frame (At) contains the same audio data in a plurality of
branched system streams (VOB), and at least the last GOP
contains the same moving picture in a plurality of system
streams (VOB) before they are connected.

Fig. 25
Description

Technical Field 

[0001] The present invention relates to a method and apparatus
for system encoding bitstreams so that they connect seamlessly
and, more particularly, to bitstreams for use in an authoring
system for variously processing a data bitstream comprising the
video data, audio data, and sub-picture data constituting each of
plural program titles containing related video data, audio data,
and sub-picture data content to generate a bitstream from which
a new title containing the content desired by the user can be
reproduced, and for efficiently recording and reproducing said
generated bitstream using a particular recording medium.

Background Art 

[0002] Authoring systems used to produce program titles
comprising related video data, audio data, and sub-picture data
by digitally processing, for example, multimedia data comprising
video, audio, and sub-picture data recorded to laser disk or
video CD formats are currently available.
[0003] Systems using Video-CDs in particular are able to record
video data to a CD format disk, which was originally designed
with an approximately 600 MB recording capacity for storing
digital audio data only, by using such high efficiency video
compression techniques as MPEG. As a result of the increased
effective recording capacity achieved using data compression
techniques, karaoke titles and other conventional laser disk
applications are gradually being transferred to the video CD
format.
[0004] Users today expect both sophisticated title content and
high reproduction quality. To meet these expectations, each
title must be composed from bitstreams with an increasingly deep
hierarchical structure. The data size of multimedia titles
written with bitstreams having such deep hierarchical
structures, however, is ten or more times greater than the data
size of less complex titles. The need to edit small image
(title) details also makes it necessary to process and control
the bitstream using low order hierarchical data units.
[0005] It is therefore necessary to develop and prove a
bitstream structure and an advanced digital processing method
including both recording and reproduction capabilities whereby a
large volume, multiple level hierarchical digital bitstream can
be efficiently controlled at each level of the hierarchy. Also
needed are an apparatus for executing this digital processing
method, and a recording medium to which the bitstream digitally
processed by said apparatus can be efficiently recorded for
storage and from which said recorded information can be quickly
reproduced.

[0006] Means of increasing the storage capacity of conventional
optical disks have been widely researched to address the
recording medium aspect of this problem. One way to increase the
storage capacity of the optical disk is to reduce the spot
diameter D of the optical (laser) beam. If the wavelength of the
laser beam is λ and the numerical aperture of the objective lens
is NA, then the spot diameter D is proportional to λ/NA, and the
storage capacity can be efficiently improved by decreasing λ and
increasing NA.
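
The proportionality above can be illustrated with a short sketch. It is not part of the disclosure: the function names are hypothetical, the optical values are assumed illustrative figures, and the step from spot diameter to capacity assumes that areal density scales with the inverse square of the spot diameter.

```python
def relative_spot_diameter(wavelength_nm: float, numerical_aperture: float) -> float:
    """Spot diameter up to a constant factor: D is proportional to wavelength / NA."""
    return wavelength_nm / numerical_aperture


def capacity_gain(old_optics: tuple, new_optics: tuple) -> float:
    """Assumed areal-density gain when the spot shrinks: (D_old / D_new) ** 2."""
    d_old = relative_spot_diameter(*old_optics)
    d_new = relative_spot_diameter(*new_optics)
    return (d_old / d_new) ** 2


# Assumed illustrative optics as (wavelength in nm, NA); not taken from this text.
longer_wavelength_low_na = (780.0, 0.45)
shorter_wavelength_high_na = (650.0, 0.60)

gain = capacity_gain(longer_wavelength_low_na, shorter_wavelength_high_na)
print(f"{gain:.2f}")  # prints 2.56
```

Under these assumptions, shortening the wavelength and raising NA together shrink the spot by a factor of 1.6, which corresponds to roughly 2.56 times the density from the optics alone.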

[0007] As described, for example, in United States Patent
5,235,581, however, coma caused by a relative tilt between the
disk surface and the optical axis of the laser beam (hereafter
"tilt") increases when a large aperture (high NA) lens is used.
To prevent tilt-induced coma, the transparent substrate must be
made very thin. The problem is that the mechanical strength of
the disk is low when the transparent substrate is very thin.
[0008] MPEG1, the conventional method of recording and
reproducing video, audio, and graphic signal data, has also been
replaced by the more robust MPEG2 method, which can transfer
large data volumes at a higher rate. It should be noted that the
compression method and data format of the MPEG2 standard differ
somewhat from those of MPEG1. The specific content of and
differences between MPEG1 and MPEG2 are described in detail in
the ISO-11172 and ISO-13818 MPEG standards, and further
description thereof is omitted below.

[0009] Note, however, that while the structure of the 
encoded video stream is defined in the MPEG2 specifi- 
cation, the hierarchical structure of the system stream 
and the method of processing lower hierarchical levels 
are not defined. 

[0010] As described above, it is therefore not possible in a
conventional authoring system to process a large data stream
containing sufficient information to satisfy many different user
requirements. Moreover, even if such a processing method were
available, the processed data recorded thereto cannot be
repeatedly used to reduce data redundancy because there is no
large capacity recording medium currently available that can
efficiently record and reproduce high volume bitstreams such as
described above.

[0011] More specifically, particular significant hardware and
software requirements must be satisfied in order to process a
bitstream using a data unit smaller than the title. These
specific hardware requirements include significantly increasing
the storage capacity of the recording medium and increasing the
speed of digital processing; software requirements include
inventing an advanced digital processing method including a
sophisticated data structure.

[0012] Therefore, the object of the present invention is to
provide an effective authoring system for controlling a
multimedia data bitstream with advanced hardware and software
requirements using a data unit smaller than the title to better
address advanced user requirements.
[0013] To share data between plural titles and thereby utilize
optical disk capacity efficiently, multi-scene control is
desirable, whereby scene data common to plural titles, and the
desired scenes on the same time-base from within multi-scene
periods containing plural scenes unique to particular
reproduction paths, can be freely selected and reproduced.
[0014] However, when plural scenes unique to a reproduction path
within the multi-scene period are arranged on the same
time-base, the scene data must be contiguous. Unselected
multi-scene data is therefore unavoidably inserted between the
selected common scene data and the selected multi-scene data.
The problem this creates when reproducing multi-scene data is
that reproduction is interrupted by this unselected scene data.

[0015] When one of the multiple scenes is connected to common
scene data, the difference between the video reproduction time
and the audio reproduction time differs on each of the
reproduction paths because of the offset between the audio and
video frame reproduction times. As a result, the audio or video
buffer underflows at the scene connection, causing video
reproduction to stop ("freeze") or audio reproduction to stop
("mute"), and thus preventing seamless reproduction. It will
also be obvious that the difference between the audio and video
reproduction times can cause a buffer underflow state even when
common scene data is connected 1:1.
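
The path-dependent difference described above can be illustrated with a small sketch. It is not taken from the disclosure: the 90 kHz tick unit, the two frame durations, and the function name `av_gap` are assumptions chosen only to show that scenes of different lengths leave different video/audio gaps at their ends.

```python
# All times are in 90 kHz clock ticks (assumed unit for this sketch).
VIDEO_FRAME_TICKS = 3003   # assumed: one video frame of 1/29.97 s
AUDIO_FRAME_TICKS = 2880   # assumed: one audio frame of 32 ms


def av_gap(video_frames: int) -> int:
    """Return the video reproduction time of a scene minus the reproduction
    time of the whole audio frames that fit inside it."""
    video_time = video_frames * VIDEO_FRAME_TICKS
    audio_frames = video_time // AUDIO_FRAME_TICKS
    return video_time - audio_frames * AUDIO_FRAME_TICKS


# Two hypothetical reproduction paths through the same multi-scene period.
gap_path_a = av_gap(100)   # 780 ticks
gap_path_b = av_gap(150)   # 1170 ticks
```

Because the two gaps differ, a single common scene that follows both paths cannot line up with the audio of both unless the streams are prepared for it, which is the situation the paragraph above describes.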
[0016] Therefore, the object of the present invention is to
provide a data structure whereby multi-scene data can be
naturally reproduced as a single title without the video
presentation stopping ("freezing") at one-to-one, one-to-many,
or many-to-many scene connections in the system stream; a method
for generating a system stream having said data structure; a
recording apparatus and a reproduction apparatus for recording
and reproducing said system stream; and a medium to which said
system stream can be recorded and from which said system stream
can be reproduced by said recording apparatus and reproduction
apparatus.
[0017] The present application is based upon Japanese Patent
Application Nos. 7-252735 and 8-041581, which were filed on
September 29, 1995 and February 28, 1996, respectively, the
entire contents of which are expressly incorporated by reference
herein.

Disclosure of Invention

[0018] The present invention has been developed with a view to
substantially solving the above described disadvantages and has
for its essential object to provide an optical disk for
recording more than one system stream containing audio data and
video data, wherein the audio data and video data of the plural
system streams recorded to the optical disk are interleaved such
that the difference between the input start times of the video
data and audio data to the video buffer in the video decoder and
the audio buffer in the audio decoder is less than the
reproduction time of the number of audio frames that can be
stored in the audio buffer plus one audio frame.
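
The interleaving condition stated above can be written as a simple check. This is a minimal sketch, not the disclosed implementation: the function name, the tick unit, the example buffer depth, and the use of an absolute difference are all assumptions made for illustration.

```python
def input_start_gap_ok(video_input_start: int,
                       audio_input_start: int,
                       audio_frame_time: int,
                       audio_buffer_frames: int) -> bool:
    """Check that the difference between the video and audio input start times
    is less than the reproduction time of (buffer capacity + 1) audio frames."""
    limit = (audio_buffer_frames + 1) * audio_frame_time
    # Assumption: the condition is applied to the magnitude of the difference.
    return abs(video_input_start - audio_input_start) < limit


# Hypothetical values in 90 kHz ticks: 32 ms audio frames, 4-frame audio buffer.
AUDIO_FRAME_TICKS = 2880
AUDIO_BUFFER_FRAMES = 4

within_limit = input_start_gap_ok(0, 10_000, AUDIO_FRAME_TICKS, AUDIO_BUFFER_FRAMES)
at_limit = input_start_gap_ok(0, 14_400, AUDIO_FRAME_TICKS, AUDIO_BUFFER_FRAMES)
```

With these example values the limit is 14,400 ticks, so a gap of 10,000 ticks satisfies the condition while a gap of exactly 14,400 ticks does not.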

Brief Description of Drawings 
[0019] 

Fig. 1 is a graph schematically showing a structure of a
multimedia bitstream according to the present invention,
Fig. 2 is a block diagram showing an authoring encoder
according to the present invention,
Fig. 3 is a block diagram showing an authoring decoder
according to the present invention,
Fig. 4 is a side view of an optical disk storing the
multimedia bitstream of Fig. 1,
Fig. 5 is an enlarged view showing the portion of Fig. 4
enclosed by a circle,
Fig. 6 is an enlarged view showing the portion of Fig. 5
enclosed by a circle,
Fig. 7 is a side view showing a variation of the optical disk
of Fig. 4,
Fig. 8 is a side view showing another variation of the optical
disk of Fig. 4,
Fig. 9 is a plan view showing one example of a track path
formed on the recording surface of the optical disk of Fig. 4,
Fig. 10 is a plan view showing another example of a track path
formed on the recording surface of the optical disk of Fig. 4,
Fig. 11 is a diagonal view schematically showing one example of
a track path pattern formed on the optical disk of Fig. 7,
Fig. 12 is a plan view showing another example of a track path
formed on the recording surface of the optical disk of Fig. 7,
Fig. 13 is a diagonal view schematically showing one example of
a track path pattern formed on the optical disk of Fig. 8,
Fig. 14 is a plan view showing another example of a track path
formed on the recording surface of the optical disk of Fig. 8,
Fig. 15 is a flow chart showing details of the decoder
synchronization process of Fig. 66,
Fig. 16 is a graph schematically showing the structure of a
multimedia bitstream for use in a Digital Video Disk system
according to the present invention,
Fig. 17 is a graph schematically showing the encoded video
stream according to the present invention,
Fig. 18 is a graph schematically showing an internal structure
of a video zone of Fig. 16,
Fig. 19 is a graph schematically showing the stream management
information according to the present invention,
Fig. 20 is a graph schematically showing the structure of the
navigation pack NV of Fig. 17,
Fig. 21 is a graph for explaining a concept of parental lock
playback control according to the present invention,
Fig. 22 is a graph schematically showing the data structure used
in a digital video disk system according to the present
invention,
Fig. 23 is a graph for explaining a concept of multi-angle scene
control according to the present invention,
Fig. 24 is a graph for explaining a concept of multi-scene data
connection,
Fig. 25 is a block diagram showing a DVD encoder according to
the present invention,
Fig. 26 is a block diagram showing a DVD decoder according to
the present invention,
Fig. 27 is a graph schematically showing an encoding information
table generated by the encoding system controller of Fig. 25,
Fig. 28 is a graph schematically showing encoding information
tables,
Fig. 29 is a graph schematically showing encoding parameters
used by the video encoder of Fig. 25,
Fig. 30 is a graph schematically showing an example of the
contents of the program chain information according to the
present invention,
Fig. 31 is a graph schematically showing another example of the
contents of the program chain information according to the
present invention,
Fig. 32 is a flow chart showing the encode parameters generating
operation for a system stream containing a single scene,
Fig. 33 is a graph for explaining a concept of multi-angle scene
control according to the present invention,
Fig. 34 is a flow chart, formed by Figs. 34A and 34B, showing an
operation of the DVD encoder of Fig. 25,
Fig. 35 is a flow chart showing details of the encode parameter
production sub-routine of Fig. 34,
Fig. 36 is a flow chart showing details of the VOB data setting
routine of Fig. 35,
Fig. 37 is a flow chart showing the encode parameters generating
operation for a seamless switching,
Fig. 38 is a flow chart showing the encode parameters generating
operation for a system stream,
Fig. 39 is a graph showing simulated results of data
input/output to the video buffer and audio buffer of the DVD
decoder of Fig. 26,
Fig. 40 is a graph for explaining a concept of parental control
according to the present invention,
Fig. 41 is a graph for explaining the data input/output to the
video buffer of the DVD decoder DCD shown in Fig. 26 during
contiguous reproduction,
Fig. 42 is a graph for explaining a possible problem under the
parental lock control example shown in Fig. 40,



Fig. 43 is a graph for explaining a reproduction gap generated
under parental lock control,
Fig. 44 is a graph showing system streams produced according to
the present invention,
Fig. 45 is a graph for explaining an operation whereby these
system streams are connected,
Fig. 46 is a graph for explaining a method of generating system
streams,
Fig. 47 is a graph for explaining another method of producing a
system stream,
Fig. 48 is a graph showing a structure of the end of the second
common system stream and the beginnings of the two parental lock
control system streams,
Fig. 49 is a graph for explaining the difference in the video
reproduction time and audio reproduction time of different
reproduction paths,
Fig. 50 is a block diagram showing an internal structure of the
system encoder in the DVD encoder of Fig. 25,
Fig. 51 is a graph showing a structure of the end of the two
parental lock control system streams and the beginning of the
following common system stream Sse,
Fig. 52 is a graph for explaining the difference in the video
reproduction time and audio reproduction time of different
reproduction paths,
Fig. 53 is a flow chart showing details of the system stream
producing routine of Fig. 34,
Fig. 54 is a graph for explaining an operation to calculate an
audio data movement MFAp1,
Fig. 55 is a graph for explaining an operation to calculate an
audio data movement MFAp2,
Fig. 56 is a block diagram showing an internal structure of the
synchronizer of Fig. 26,
Fig. 57 is a flow chart showing an operation executed by the
audio decoder controller of Fig. 26,
Figs. 58 and 59 are graphs showing decoding information tables
produced by the decoding system controller of Fig. 26,
Fig. 60 is a flow chart showing the operation of the DVD decoder
DCD of Fig. 26,
Fig. 61 is a flow chart showing details of reproduction
extracted PGC routing of Fig. 60,
Fig. 62 is a flow chart showing details of the stream buffer
data transfer process according to the present invention,
Fig. 63 is a flow chart showing details of the non-multi-angle
decoding process of Fig. 62,
Fig. 64 is a flow chart showing details of the non-multi-angled
interleave process of Fig. 63,
Fig. 65 is a flow chart showing details of the non-multi-angled
contiguous block process,
Fig. 66 is a flow chart showing details of the decoding data
process of Fig. 64, performed by the stream buffer,
Fig. 67 is a graph schematically showing an actual arrangement
of data blocks recorded to a data recording track on a recording
medium according to the present invention,
Fig. 68 is a graph schematically showing an array of contiguous
block regions and interleaved block regions,
Fig. 69 is a graph schematically showing a content of a VTS
title VOBS according to the present invention, and
Fig. 70 is a graph schematically showing an internal data
structure of the interleaved block regions according to the
present invention.

Best Mode for Carrying Out the Invention 

[0020] The present invention is described in detail below with
reference to the accompanying drawings.

Data structure of the authoring system

[0021] The logical structure of the multimedia data bitstream
processed using the recording apparatus, recording medium,
reproduction apparatus, and authoring system according to the
present invention is described first below with reference to
Fig. 1.
[0022] In this structure, one title refers to the combina- 
tion of video and audio data expressing program con- 
tent recognized by a user for education, entertainment, 
or other purpose. Referenced to a motion picture 
(movie), one title may correspond to the content of an 
entire movie, or to just one scene within said movie. 
[0023] A video title set (VTS) comprises the bitstream 
data containing the information for a specific number of 
titles. More specifically, each VTS comprises the video, 
audio, and other reproduction data representing the 
content of each title in the set, and control data for con- 
trolling the content data. 

[0024] The video zone VZ is the video data unit proc- 
essed by the authoring system, and comprises a spe- 
cific number of video title sets. More specifically, each 
video zone is a linear sequence of K + 1 video title sets 
numbered VTS #0 - VTS #K where K is an integer value 
of zero or greater. One video title set, preferably the first 
video title set VTS #0, is used as the video manager 
describing the content information of the titles contained 
in each video title set. 

[0025] The multimedia bitstream MBS is the largest 
control unit of the multimedia data bitstream handled by 
the authoring system of the present invention, and com- 
prises plural video zones VZ. 
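
The hierarchy described above (multimedia bitstream MBS, video zone VZ, video title set VTS, title) can be pictured with a few container types. This is only an illustrative model: the class names, fields, and the `make_zone` helper are hypothetical and are not structures defined in this description.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Title:
    name: str


@dataclass
class VideoTitleSet:
    number: int                                   # VTS #0 .. VTS #K
    titles: List[Title] = field(default_factory=list)


@dataclass
class VideoZone:
    title_sets: List[VideoTitleSet] = field(default_factory=list)

    @property
    def video_manager(self) -> VideoTitleSet:
        """The first video title set (VTS #0) is preferably the video manager."""
        return self.title_sets[0]


@dataclass
class MultimediaBitstream:
    zones: List[VideoZone] = field(default_factory=list)


def make_zone(k: int) -> VideoZone:
    """Build a video zone as a linear sequence of K + 1 video title sets."""
    if k < 0:
        raise ValueError("K must be an integer of zero or greater")
    return VideoZone([VideoTitleSet(number=n) for n in range(k + 1)])


zone = make_zone(2)                               # VTS #0, VTS #1, VTS #2
mbs = MultimediaBitstream(zones=[zone])
```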

Authoring encoder EC 

[0026] A preferred embodiment of the authoring encoder EC
according to the present invention for generating a new
multimedia bitstream MBS by re-encoding the original multimedia
bitstream MBS according to the scenario desired by the user is
shown in Fig. 2. Note that the original multimedia bitstream MBS
comprises a video stream St1 containing the video information, a
sub-picture stream St3 containing caption text and other
auxiliary video information, and an audio stream St5 containing
the audio information.
[0027] The video and audio streams are the bitstreams containing
the video and audio information obtained from the source within
a particular period of time. The sub-picture stream is a
bitstream containing momentary video information relevant to a
particular scene. The sub-picture data encoded for a single
scene may be captured to video memory and displayed continuously
from the video memory for plural scenes as may be necessary.

[0028] When this multimedia source data St1, St3, and St5 is
obtained from a live broadcast, the video and audio signals are
supplied in real-time from a video camera or other imaging
source; when the multimedia source data is reproduced from a
video tape or other recording medium, the audio and video
signals are not real-time signals.

[0029] While the multimedia source stream is shown in Fig. 2 as
comprising these three source signals, this is for convenience
only, and it should be noted that the multimedia source stream
may contain more than three types of source signals, and may
contain source data for different titles. Multimedia source data
with audio, video, and sub-picture data for plural titles are
referred to below as multi-title streams.

[0030] As shown in Fig. 2, the authoring encoder EC comprises a
scenario editor 100, encoding system controller 200, video
encoder 300, video stream buffer 400, sub-picture encoder 500,
sub-picture stream buffer 600, audio encoder 700, audio stream
buffer 800, system encoder 900, video zone formatter 1300,
recorder 1200, and recording medium M.

[0031] The video zone formatter 1300 comprises video object
(VOB) buffer 1000, formatter 1100, and volume and file structure
formatter 1400.
[0032] The bitstream encoded by the authoring encoder EC of the
present embodiment is recorded by way of example only to an
optical disk.

[0033] The scenario editor 100 of the authoring encoder EC
outputs the scenario data, i.e., the user-defined editing
instructions. The scenario data controls editing the
corresponding parts of the multimedia bitstream MBS according to
the user's manipulation of the video, sub-picture, and audio
components of the original multimedia title. This scenario
editor 100 preferably comprises a display, speaker(s), keyboard,
CPU, and source stream buffer. The scenario editor 100 is
connected to an external multimedia bitstream source from which
the multimedia source data St1, St3, and St5 are supplied.

[0034] The user is thus able to reproduce the video and audio
components of the multimedia source data using the display and
speaker to confirm the content of the generated title. The user
is then able to edit the title content according to the desired
scenario using the keyboard, mouse, and other command input
devices while confirming the content of the title on the display
and speakers. The result of this multimedia data manipulation is
the scenario data St7.

[0035] The scenario data St7 is basically a set of instructions
describing what source data is selected from all or a subset of
the source data containing plural titles within a defined time
period, and how the selected source data is reassembled to
reproduce the scenario (sequence) intended by the user. Based on
the instructions received through the keyboard or other control
device, the CPU codes the position, length, and the relative
time-based positions of the edited parts of the respective
multimedia source data streams St1, St3, and St5 to generate the
scenario data St7.
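
One way to picture the scenario data St7 is as a list of edit records, each carrying the position, length, and relative time-base position named above. The sketch below is illustrative only: the `EditSegment` type, its field names, and the ordering rule in `build_scenario` are assumptions, not the format used by the scenario editor.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class EditSegment:
    stream: str        # source stream label: "St1", "St3", or "St5"
    position: int      # start of the edited part within the source stream
    length: int        # length of the edited part
    title_time: int    # relative time-base position within the edited title


def build_scenario(segments: List[EditSegment]) -> List[EditSegment]:
    """Validate the edit records and order them by their place in the title."""
    for seg in segments:
        if seg.position < 0 or seg.length <= 0:
            raise ValueError(f"invalid segment: {seg}")
    return sorted(segments, key=lambda s: (s.title_time, s.stream))


# Hypothetical scenario: a later part of the source is played first.
st7 = build_scenario([
    EditSegment("St5", position=900, length=300, title_time=0),
    EditSegment("St1", position=900, length=300, title_time=0),
    EditSegment("St1", position=100, length=200, title_time=300),
])
```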
[0036] The source stream buffer has a specific capacity, and is
used to delay the multimedia source data streams St1, St3, and
St5 by a known time Td before outputting streams St1, St3, and
St5.
[0037] This delay is required for synchronization with the
editor encoding process. More specifically, when data encoding
and user generation of scenario data St7 are executed
simultaneously, i.e., when encoding immediately follows editing,
time Td is required to determine the content of the multimedia
source data editing process based on the scenario data St7 as
will be described further below. As a result, the multimedia
source data must be delayed by time Td to synchronize the
editing process during the actual encoding operation. Because
this delay time Td is limited to the time required to
synchronize the operation of the various system components in
the case of sequential editing as described above, the source
stream buffer is normally implemented using a high speed storage
medium such as semiconductor memory.

[0038] During batch editing in which all multimedia source data
is encoded at once ("batch encoded") after scenario data St7 is
generated for the complete title, delay time Td must be long
enough to process the complete title or longer. In this case,
the source stream buffer may be a low speed, high capacity
storage medium such as video tape, magnetic disk, or optical
disk.

[0039] The structure (type) of media used for the source stream
buffer may therefore be determined according to the delay time
Td required and the allowable manufacturing cost.
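
The role of the source stream buffer can be sketched as a fixed-length FIFO delay. This is a simplified model rather than the disclosed hardware: the class name is hypothetical, and the delay Td is expressed as a count of data units instead of a time.

```python
from collections import deque
from typing import Any, Optional


class SourceStreamBuffer:
    """Delay a stream of data units by a fixed number of units (stands in for Td)."""

    def __init__(self, delay_units: int) -> None:
        if delay_units < 0:
            raise ValueError("delay must be zero or greater")
        self._delay = delay_units
        self._fifo: deque = deque()

    def push(self, unit: Any) -> Optional[Any]:
        """Store one unit; return the unit that entered delay_units pushes ago,
        or None while the buffer is still filling."""
        self._fifo.append(unit)
        if len(self._fifo) > self._delay:
            return self._fifo.popleft()
        return None


buffer = SourceStreamBuffer(delay_units=2)
delayed = [buffer.push(unit) for unit in ["a", "b", "c", "d"]]
```

A short delay of this kind fits in semiconductor memory, whereas a delay covering a complete title corresponds to the high capacity media mentioned for batch editing.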

[0040] The encoding system controller 200 is connected to the
scenario editor 100 and receives the scenario data St7
therefrom. Based on the time-base position and length
information of the edit segment contained in the scenario data
St7, the encoding system controller 200 generates the encoding
parameter signals St9, St11, and St13 for encoding the edit
segment of the multimedia source data. The encoding parameter
signals St9, St11, and St13 supply the parameters used for
video, sub-picture, and audio encoding, including the encoding
start and end timing. Note that multimedia source data St1, St3,
and St5 are output after delay time Td by the source stream
buffer, and are therefore synchronized to encoding parameter
signals St9, St11, and St13.

[0041] More specifically, encoding parameter signal St9 is the
video encoding signal specifying the encoding timing of video
stream St1 to extract the encoding segment from the video stream
St1 and generate the video encoding unit. Encoding parameter
signal St11 is likewise the sub-picture stream encoding signal
used to generate the sub-picture encoding unit by specifying the
encoding timing for sub-picture stream St3. Encoding parameter
signal St13 is the audio encoding signal used to generate the
audio encoding unit by specifying the encoding timing for audio
stream St5.
[0042] Based on the time-base relationship between the encoding
segments of streams St1, St3, and St5 in the multimedia source
data contained in scenario data St7, the encoding system
controller 200 generates the timing signals St21, St23, and St25
arranging the encoded multimedia stream in the specified
time-base relationship.

[0043] The encoding system controller 200 also gen- 
erates the reproduction time information IT defining the 
reproduction time of the title editing unit (video object, 
VOB), and the stream encoding data St33 defining the 
system encode parameters for multiplexing the 
encoded multimedia stream containing video, audio, 
and sub-picture data. Note that the reproduction time 
information IT and stream encoding data St33 are gen- 
erated for the video object VOB of each title in one video 
zone VZ. 

[0044] The encoding system controller 200 also generates the
title sequence control signal St39, which declares the
formatting parameters for formatting the title editing units VOB
of each of the streams in a particular time-base relationship as
a multimedia bitstream. More specifically, the title sequence
control signal St39 is used to control the connections between
the title editing units (VOB) of each title in the multimedia
bitstream MBS, or to control the sequence of the interleaved
title editing units (VOBs) formed by interleaving the title
editing units VOB of plural reproduction paths.
[0045] The video encoder 300 is connected to the source stream
buffer of the scenario editor 100 and to the encoding system
controller 200, and receives therefrom the video stream St1 and
video encoding parameter signal St9, respectively. Encoding
parameters supplied by the video encoding signal St9 include the
encoding start and end timing, bit rate, the encoding conditions
for the encoding start and end, and the material type. Possible
material types include NTSC or PAL video signal, and telecine
converted material. Based on the video encoding parameter signal
St9, the video encoder 300 encodes a specific part of the video
stream St1 to generate the encoded video stream St15.
[0046] The sub-picture encoder 500 is similarly connected to the
source stream buffer of the scenario editor 100 and to the
encoding system controller 200, and receives therefrom the
sub-picture stream St3 and sub-picture encoding parameter signal
St11, respectively. Based on the sub-picture encoding parameter
signal St11, the sub-picture encoder 500 encodes a specific part
of the sub-picture stream St3 to generate the encoded
sub-picture stream St17.
[0047] The audio encoder 700 is also connected to the source
stream buffer of the scenario editor 100 and to the encoding
system controller 200, and receives therefrom the audio stream
St5 and audio encoding parameter signal St13, which supplies the
encoding start and end timing. Based on the audio encoding
parameter signal St13, the audio encoder 700 encodes a specific
part of the audio stream St5 to generate the encoded audio
stream St19.

[0048] The video stream buffer 400 is connected to 
the video encoder 300 and to the encoding system con- 
troller 200. The video stream buffer 400 stores the 
encoded video stream St15 input from the video 
encoder 300, and outputs the stored encoded video 
stream St15 as the time-delayed encoded video stream 
St27 based on the timing signal St21 supplied from the 
encoding system controller 200. 

[0049] The sub-picture stream buffer 600 is similarly connected
to the sub-picture encoder 500 and to the encoding system
controller 200. The sub-picture stream buffer 600 stores the
encoded sub-picture stream St17 output from the sub-picture
encoder 500, and then outputs the stored encoded sub-picture
stream St17 as time-delayed encoded sub-picture stream St29
based on the timing signal St23 supplied from the encoding
system controller 200.

[0050] The audio stream buffer 800 is similarly connected to the
audio encoder 700 and to the encoding system controller 200. The
audio stream buffer 800 stores the encoded audio stream St19
input from the audio encoder 700, and then outputs the encoded
audio stream St19 as the time-delayed encoded audio stream St31
based on the timing signal St25 supplied from the encoding
system controller 200.
[0051] The system encoder 900 is connected to the video stream
buffer 400, sub-picture stream buffer 600, audio stream buffer
800, and the encoding system controller 200, and is respectively
supplied thereby with the time-delayed encoded video stream
St27, time-delayed encoded sub-picture stream St29, time-delayed
encoded audio stream St31, and the stream encoding data St33.
Note that the system encoder 900 is a multiplexer that
multiplexes the time-delayed streams St27, St29, and St31 based
on the stream encoding data St33 (timing signal) to generate
title editing unit (VOB) St35. The stream encoding data St33
contains the system encoding parameters, including the encoding
start and end timing.
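
The multiplexing role of the system encoder can be illustrated with a timestamp-ordered merge of three packet lists. This is a deliberately simplified sketch: real system encoding also accounts for decoder buffer occupancy, and the packet tuples and the `multiplex` function are assumptions introduced here, not the disclosed method.

```python
import heapq
from typing import List, Tuple

Packet = Tuple[int, str]   # (timestamp, payload label); hypothetical packet form


def multiplex(video: List[Packet],
              sub_picture: List[Packet],
              audio: List[Packet]) -> List[Packet]:
    """Merge three timestamp-ordered packet lists into one list ordered by
    timestamp. Ties keep the argument order: video, sub-picture, audio."""
    return list(heapq.merge(video, sub_picture, audio, key=lambda p: p[0]))


# Hypothetical stand-ins for the time-delayed streams St27, St29, and St31.
st27 = [(0, "V0"), (3, "V1")]
st29 = [(1, "S0")]
st31 = [(0, "A0"), (2, "A1")]

st35 = multiplex(st27, st29, st31)
```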

[0052] The video zone formatter 1300 is connected to
the system encoder 900 and the encoding system controller
200, from which the title editing unit (VOB) St35
and title sequence control signal St39 (timing signal) are
respectively supplied. The title sequence control signal
St39 contains the formatting start and end timing, and
the formatting parameters used to generate (format) a
multimedia bitstream MBS. The video zone formatter
1300 rearranges the title editing units (VOB) St35 in one
video zone VZ in the scenario sequence defined by the
user based on the title sequence control signal St39 to
generate the edited multimedia stream data St43.
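
The rearrangement performed by the video zone formatter can be pictured with a short sketch. The function name `format_video_zone`, the dictionary of named units, and the scenario list are assumptions made for illustration; the specification does not define such an interface.

```python
def format_video_zone(vobs, scenario):
    """Arrange title editing units (VOBs) in the user-defined scenario order.

    vobs:     dict mapping a hypothetical VOB name to its multiplexed data
    scenario: list of VOB names in the order defined by the user
    """
    missing = [name for name in scenario if name not in vobs]
    if missing:
        raise KeyError(f"scenario references unknown VOBs: {missing}")
    # The output sequence follows the scenario, not the encoding order.
    return [vobs[name] for name in scenario]
```

A unit that the scenario does not name is simply left out of the video zone in this model.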
[0053] The multimedia bitstream MBS St43 edited
according to the user-defined scenario is then sent to
the recorder 1200. The recorder 1200 processes the
edited multimedia stream data St43 to the data stream
St45 format of the recording medium M, and thus
records the formatted data stream St45 to the recording
medium M. Note that the multimedia bitstream MBS
recorded to the recording medium M contains the volume
file structure VFS, which includes the physical
address of the data on the recording medium generated
by the video zone formatter 1300.

[0054] Note that the encoded multimedia bitstream
MBS St35 may be output directly to the decoder to
immediately reproduce the edited title content. It will be
obvious that the output multimedia bitstream MBS will
not in this case contain the volume file structure VFS.

Authoring decoder DC 

[0055] A preferred embodiment of the authoring
decoder DC used to decode the multimedia bitstream
MBS edited by the authoring encoder EC of the present
invention, and thereby reproduce the content of each
title unit according to the user-defined scenario, is
described below with reference to Fig. 3. Note that
in the preferred embodiment described below the multimedia
bitstream St45 encoded by the authoring
encoder EC is recorded to the recording medium M.
[0056] As shown in Fig. 3, the authoring decoder DC
comprises a multimedia bitstream producer 2000, scenario
selector 2100, decoding system controller 2300,
stream buffer 2400, system decoder 2500, video buffer
2600, sub-picture buffer 2700, audio buffer 2800, synchronizer
2900, video decoder 3800, sub-picture
decoder 3100, audio decoder 3200, synthesizer 3500,
video data output terminal 3600, and audio data output
terminal 3700.

[0057] The bitstream producer 2000 comprises a
recording media drive unit 2004 for driving the recording
medium M; a reading head 2006 for reading the information
recorded to the recording medium M and producing
the binary read signal St57; a signal processor
2008 for variously processing the read signal St57 to
generate the reproduced bitstream St61; and a reproduction
controller 2002.
[0058] The reproduction controller 2002 is connected
to the decoding system controller 2300 from which the
multimedia bitstream reproduction control signal St53 is
supplied, and in turn generates the reproduction control
signals St55 and St59 respectively controlling the
recording media drive unit (motor) 2004 and signal
processor 2008.

[0059] So that the user-defined video, sub-picture, 
and audio portions of the multimedia title edited by the 
authoring encoder EC are reproduced, the authoring
decoder DC comprises a scenario selector 2100 for 
selecting and reproducing the corresponding scenes 
(titles). The scenario selector 2100 then outputs the 
selected titles as scenario data to the authoring decoder 
DC.
[0060] The scenario selector 2100 preferably com- 
prises a keyboard, CPU, and monitor. Using the key- 
board, the user then inputs the desired scenario based 
on the content of the scenario input by the authoring 
encoder EC. Based on the keyboard input, the CPU
generates the scenario selection data St51 specifying 
the selected scenario. The scenario selector 2100 is 
connected by an infrared communications device, for 
example, to the decoding system controller 2300, to 
which it inputs the scenario selection data St51.
[0061] Based on the scenario selection data St51, the
decoding system controller 2300 then generates the bit- 
stream reproduction control signal St53 controlling the 
operation of the bitstream producer 2000. 
[0062] The stream buffer 2400 has a specific buffer
capacity used to temporarily store the reproduced bit- 
stream St61 input from the bitstream producer 2000, 
extract the address information and initial synchroniza- 
tion data SCR (system clock reference) for each stream, 
and generate bitstream control data St63. The stream
buffer 2400 is also connected to the decoding system 
controller 2300, to which it supplies the generated bit- 
stream control data St63. 

[0063] The synchronizer 2900 is connected to the 
decoding system controller 2300 from which it receives
the system clock reference SCR contained in the syn- 
chronization control data St81 to set the internal system 
clock STC and supply the reset system clock St79 to the 
decoding system controller 2300. 

[0064] Based on this system clock St79, the decoding
system controller 2300 also generates the stream read 
signal St65 at a specific interval and outputs the read 
signal St65 to the stream buffer 2400. 
[0065] Based on the supplied read signal St65, the 
stream buffer 2400 outputs the reproduced bitstream
St61 at a specific interval to the system decoder 2500 
as bitstream St67. 

[0066] Based on the scenario selection data St51, the
decoding system controller 2300 generates the decoding
signal St69 defining the stream IDs for the video,
sub-picture, and audio bitstreams corresponding to the
selected scenario, and outputs it to the system decoder
2500.



[0067] Based on the instructions contained in the 
decoding signal St69, the system decoder 2500 respec- 
tively outputs the video, sub-picture, and audio bit- 
streams input from the stream buffer 2400 to the video 
buffer 2600, sub-picture buffer 2700, and audio buffer 
2800 as the encoded video stream St71, encoded sub-picture
stream St73, and encoded audio stream St75.
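
The routing step performed by the system decoder 2500 might be sketched as below. The packet tuple layout, the `selected_ids` mapping (standing in for the decoding signal St69), and the function name `system_decode` are assumptions for illustration and are not taken from the specification.

```python
def system_decode(packets, selected_ids):
    """Route packets to per-type buffers, keeping only the selected stream IDs.

    packets:      iterable of (stream_type, stream_id, payload) tuples
    selected_ids: dict mapping stream_type to the stream ID chosen for the
                  selected scenario (the role played by decoding signal St69)
    """
    buffers = {"video": [], "sub-picture": [], "audio": []}
    for stream_type, stream_id, payload in packets:
        # Packets of unselected streams (e.g. another language) are dropped.
        if selected_ids.get(stream_type) == stream_id:
            buffers[stream_type].append(payload)
    return buffers
```

In this model the three lists correspond to the video buffer 2600, sub-picture buffer 2700, and audio buffer 2800.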
[0068] The system decoder 2500 detects the presen- 
tation time stamp PTS and decoding time stamp DTS of 
the smallest control unit in each bitstream St67 to gen- 
erate the time information signal St77. This time infor- 
mation signal St77 is supplied to the synchronizer 2900 
through the decoding system controller 2300 as the 
synchronization control data St81.
[0069] Based on this synchronization control data 
St81, the synchronizer 2900 determines the decoding
start timing whereby each of the bitstreams will be 
arranged in the correct sequence after decoding, and 
then generates and inputs the video stream decoding 
start signal St89 to the video decoder 3800 based on 
this decoding timing. The synchronizer 2900 also generates
and supplies the sub-picture decoding start signal
St91 and audio stream decoding start signal St93 to
the sub-picture decoder 3100 and audio decoder 3200, 
respectively. 
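
The distinction between decoding order and presentation order that the synchronizer relies on can be illustrated with a small sketch. The tuple layout and the function name are hypothetical, and the frame labels reflect general MPEG practice (a predicted frame is decoded before the bidirectional frames that reference it) rather than anything stated in this specification.

```python
def decode_and_present(units):
    """Return (decode order, presentation order) as lists of unit names.

    units: iterable of (name, dts, pts) tuples, where dts is the decoding
           time stamp and pts the presentation time stamp of each unit.
    """
    decode_order = sorted(units, key=lambda u: u[1])        # by DTS
    present_order = sorted(decode_order, key=lambda u: u[2])  # by PTS
    return [u[0] for u in decode_order], [u[0] for u in present_order]
```

For the hypothetical units `("I0", 0, 1)`, `("P3", 1, 4)`, `("B1", 2, 2)`, and `("B2", 3, 3)`, the decode order is I0, P3, B1, B2 while the presentation order is I0, B1, B2, P3.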

[0070] The video decoder 3800 generates the video 
output request signal St84 based on the video stream 
decoding start signal St89, and outputs it to the video
buffer 2600. In response to the video output request sig- 
nal St84, the video buffer 2600 outputs the video stream 
St83 to the video decoder 3800. The video decoder 
3800 thus detects the presentation time information 
contained in the video stream St83, and disables the 
video output request signal St84 when the length of the 
received video stream St83 is equivalent to the speci- 
fied presentation time. A video stream equal in length to 
the specified presentation time is thus decoded by the 
video decoder 3800, which outputs the reproduced
video signal St104 to the synthesizer 3500.
[0071] The sub-picture decoder 3100 similarly generates
the sub-picture output request signal St86 based
on the sub-picture decoding start signal St91, and outputs
it to the sub-picture buffer 2700. In response to the
sub-picture output request signal St86, the sub-picture
buffer 2700 outputs the sub-picture stream St85 to the 
sub-picture decoder 3100. Based on the presentation 
time information contained in the sub-picture stream 
St85, the sub-picture decoder 3100 decodes a length of 
the sub-picture stream St85 corresponding to the spec- 
ified presentation time to reproduce and supply to the 
synthesizer 3500 the sub-picture signal St99. 
[0072] The synthesizer 3500 superimposes the video 
signal St104 and sub-picture signal St99 to generate
and output the multi-picture video signal St105 to the 
video data output terminal 3600. 
[0073] The audio decoder 3200 generates and sup- 
plies to the audio buffer 2800 the audio output request 
signal St88 based on the audio stream decoding start 
signal St93. The audio buffer 2800 thus outputs the
audio stream St87 to the audio decoder 3200. The 
audio decoder 3200 decodes a length of the audio 
stream St87 corresponding to the specified presentation
time based on the presentation time information
contained in the audio stream St87, and outputs the 
decoded audio stream St101 to the audio data output 
terminal 3700. 

[0074] It is thus possible to reproduce a user-defined 
multimedia bitstream MBS in real-time according to a
user-defined scenario. More specifically, each time the 
user selects a different scenario, the authoring decoder 
DC is able to reproduce the title content desired by the 
user in the desired sequence by reproducing the multimedia
bitstream MBS corresponding to the selected
scenario. 

[0075] It is therefore possible by means of the author- 
ing system of the present invention to generate a multi- 
media bitstream according to plural user-defined 
scenarios by real-time or batch encoding multimedia
source data in a manner whereby the substreams of the 
smallest editing units (scenes), which can be divided 
into plural substreams, expressing the basic title content 
are arranged in a specific time-base relationship. 
[0076] The multimedia bitstream thus encoded can
then be reproduced according to the one scenario 
selected from among plural possible scenarios. It is also 
possible to change scenarios while playback is in 
progress, i.e., to select a different scenario and dynamically
generate a new multimedia bitstream according to
the most recently selected scenario. It is also possible 
to dynamically select and reproduce any of plural 
scenes while reproducing the title content according to 
a desired scenario. 

[0077] It is therefore possible by means of the authoring
system of the present invention to encode and not
only reproduce but also repeatedly reproduce a multimedia
bitstream MBS in real-time.

[0078] Details of the authoring system are disclosed in
a Japanese Patent Application filed September 27, 1996,
and entitled and assigned to the same assignee as the 
present application. 

DVD 


[0079] An example of a digital video disk (DVD) with 
only one recording surface (a single-sided DVD) is 
shown in Fig. 4. 

[0080] The DVD recording medium RC1 in the preferred
embodiment of the invention comprises a data
recording surface RS1 to and from which data is written 
and read by emitting laser beam LS, and a protective 
layer PL1 covering the data recording surface RS1. A 
backing layer BL1 is also provided on the back of data 
recording surface RS1. The side of the disk on which
protective layer PL1 is provided is therefore referred to 
below as side SA (commonly "side A"), and the opposite 
side (on which the backing layer BL1 is provided) is
referred to as side SB ("side B"). Note that digital video
disk recording media having a single data recording sur- 
face RS1 on only one side such as this DVD recording 
medium RC1 is commonly called a single-sided single 
layer disk. 

[0081] A detailed illustration of area C1 in Fig. 4 is 
shown in Fig. 5. Note that the data recording surface 
RS1 is formed by applying a metallic thin film or other 
reflective coating as a data layer 4109 on a first transparent
layer 4108 having a particular thickness T1. This
first transparent layer 4108 also functions as the protective
layer PL1. A second transparent substrate 4111 of
a thickness T2 functions as the backing layer BL1, and
is bonded to the first transparent layer 4108 by means of 
an adhesive layer 4110 disposed therebetween. 
[0082] A printing layer 4112 for printing a disk label
may also be disposed on the second transparent substrate
4111 as necessary. The printing layer 4112 does
not usually cover the entire surface area of the second
transparent substrate 4111 (backing layer BL1), but only
the area needed to print the text and graphics of the disk
label. The area of the second transparent substrate 4111 on
which the printing layer 4112 is not formed may be left
exposed. Light reflected from the data layer 4109 
(metallic thin film) forming the data recording surface 
RS1 can therefore be directly observed where the label 
is not printed when the digital video disk is viewed from 
side SB. As a result, the background appears silver-white,
with the printed text and graphics appearing to float over it,
when the metallic thin film is an aluminum thin film, for
example.

[0083] Note that it is only necessary to provide the 
printing layer 4112 where needed for printing, and it is 
not necessary to provide the printing layer 4112 over the 
entire surface of the backing layer BL1.
[0084] A detailed illustration of area C2 in Fig. 5 is 
shown in Fig. 6. Pits and lands are molded to the com- 
mon contact surface between the first transparent layer 
4108 and the data layer 4109 on side SA from which 
data is read by emitting a laser beam LS, and data is 
recorded by varying the lengths of the pits and lands 
(i.e., the length of the intervals between the pits). More 
specifically, the pit and land configuration formed on the 
first transparent layer 4108 is transferred to the data 
layer 4109. The lengths of the pits and lands are shorter,
and the pitch of the data tracks formed by the pit 
sequences is narrower, than with a conventional Com- 
pact Disc (CD). The surface recording density is there- 
fore greatly improved. 

[0085] Side SA of the first transparent layer 4108 on 
which data pits are not formed is a flat surface. The sec- 
ond transparent substrate 4111 is for reinforcement, 
and is a transparent panel made from the same material 
as the first transparent layer 4108 with both sides flat. 
Thicknesses T1 and T2 are preferably equal and com- 
monly approximately 0.6 mm, but the invention shall not 
be so limited. 

[0086] As with a CD, information is read by irradiating 
the surface with a laser beam LS and detecting the
change in the reflectivity of the light spot. Because the 
objective lens aperture NA can be large and the wavelength
λ of the light beam small in a digital video disk
system, the diameter of the light spot Ls used can be 
reduced to approximately 1/1.6 the light spot needed to 
read a CD. Note that this means the resolution of the 
laser beam LS in the DVD system is approximately 1.6
times the resolution of a conventional CD system. 
[0087] The optical system used to read data from the 
digital video disk uses a short 650 nm wavelength red 
semiconductor laser and an objective lens with a
numerical aperture NA of 0.6. By thus also reducing the thickness T
of the transparent panels to 0.6 mm, more than 5 GB of 
data can be stored to one side of a 120 mm diameter 
optical disk. 
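
The factor of approximately 1/1.6 can be checked with a short calculation, on the usual assumption that the light spot diameter scales with wavelength divided by numerical aperture. The DVD values (650 nm, aperture 0.6) are those given above; the CD values (780 nm, numerical aperture 0.45) are typical figures assumed here and do not appear in this specification.

```python
def relative_spot_diameter(wavelength_nm, numerical_aperture):
    """Quantity proportional to the focused spot diameter (wavelength / NA)."""
    return wavelength_nm / numerical_aperture


cd = relative_spot_diameter(780.0, 0.45)   # assumed typical CD optics, not from the source
dvd = relative_spot_diameter(650.0, 0.60)  # DVD optics as described in the text
ratio = dvd / cd                           # DVD spot size relative to CD
print(round(ratio, 3), round(1 / ratio, 2))  # prints: 0.625 1.6
```

Under these assumptions the DVD spot is 0.625 times the CD spot, which agrees with the 1/1.6 figure stated in the text.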

[0088] It is therefore possible to store motion picture 
(video) images having an extremely large per unit data 
size to a digital video disk system disk without losing 
image quality because the storage capacity of a single- 
sided, single-layer recording medium RC1 with one data 
recording surface RS1 as thus described is nearly ten 
times the storage capacity of a conventional CD. As a 
result, while the video presentation time of a conven- 
tional CD system is approximately 74 minutes if image 
quality is sacrificed, high quality video images with a 
video presentation time exceeding two hours can be 
recorded to a DVD. 
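
As a rough plausibility check of the two-hour figure, the presentation time that fits on one side can be estimated from capacity and average bit rate. The 5 GB capacity follows the "more than 5 GB" stated above; the 5 Mbit/s average rate for the combined streams is purely an assumed value for illustration.

```python
def playback_minutes(capacity_bytes, average_bits_per_second):
    """Presentation time, in minutes, that fits in the given capacity."""
    return capacity_bytes * 8 / average_bits_per_second / 60


# 5e9 bytes from the text; 5e6 bit/s is an assumed average rate.
minutes = playback_minutes(5e9, 5e6)
print(round(minutes, 1))  # prints: 133.3
```

With these numbers the result is a little over two hours; a higher assumed bit rate shortens the time proportionally.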

[0089] The digital video disk is therefore well-suited as 
a recording medium for video images. 
[0090] A digital video disk recording medium with plu- 
ral recording surfaces RS as described above is shown 
in Figs. 7 and 8. The DVD recording medium RC2 
shown in Fig. 7 comprises two recording surfaces, i.e., 
first recording surface RS1 and semi-transparent sec- 
ond recording surface RS2, on the same side, i.e. side 
SA, of the disk. Data can be simultaneously recorded or 
reproduced from these two recording surfaces by using 
different laser beams LS1 and LS2 for the first recording 
surface RS1 and the second recording surface RS2. It is 
also possible to read/write both recording surfaces RS1
and RS2 using only one of the laser beams LS1 or LS2. 
Note that recording media thus comprised are called 
"single-side, dual-layer disks." 

[0091] It should also be noted that while two recording
surfaces RS1 and RS2 are provided in this example, it 
is also possible to produce digital video disk recording 
media having more than two recording surfaces RS. 
Disks thus comprised are known as "single-sided, multi- 
layer disks." 

[0092] Though comprising two recording surfaces 
similarly to the recording media shown in Fig. 7, the 
DVD recording medium RC3 shown in Fig. 8 has the 
recording surfaces on opposite sides of the disk, i.e.,
has the first data recording surface RS1 on side SA and
the second data recording surface RS2 on side SB. It
will also be obvious that while only two recording surfaces
are shown on one digital video disk in this example,
more than two recording surfaces may also be
formed on a double-sided digital video disk. As with the
recording medium shown in Fig. 7, it is also possible to
provide two separate laser beams LS1 and LS2 for
recording surfaces RS1 and RS2, or to read/write both
recording surfaces RS1 and RS2 using a single laser
beam. Note that this type of digital video disk is called a
"double-sided, dual-layer disk." It will also be obvious
that a double-sided digital video disk can be comprised
with two or more recording surfaces per side. This type
of disk is called a "double-sided, multi-layer disk."
[0093] A plan view from the laser beam LS irradiation 
side of the recording surface RS of the DVD recording 
medium RC is shown in Fig. 9 and Fig. 10. Note that a
continuous spiral data recording track TR is provided
from the inside circumference to the outside circumference
of the DVD. The data recording track TR is divided
into plural sectors each having the same known storage
capacity. Note that for simplicity only the data recording
track TR is shown in Fig. 9 with more than three sectors
per revolution.

[0094] As shown in Fig. 9, the data recording track TR 
is normally formed clockwise inside to outside (see 
arrow DrA) from the inside end point IA at the inside circumference
of disk RCA to the outside end point OA at
the outside circumference of the disk with the disk RCA 
rotating counterclockwise RdA. This type of disk RCA is 
called a clockwise disk, and the recording track formed 
thereon is called a clockwise track TRA. 

[0095] Depending upon the application, the recording
track TRB may be formed clockwise from outside to
inside circumference (see arrow DrB in Fig. 10) from the
outside end point OB at the outside circumference of
disk RCB to the inside end point IB at the inside circumference
of the disk with the disk RCB rotating clockwise
RdB. Because the recording track appears to wind
counterclockwise when viewed from the inside circumference
to the outside circumference on disks with the
recording track formed in the direction of arrow DrB,
these disks are referred to as counterclockwise disk
RCB with counterclockwise track TRB to distinguish
them from disk RCA in Fig. 9. Note that track directions
DrA and DrB are the track paths along which the laser
beam travels when scanning the tracks for recording
and playback. Direction of disk rotation RdA in which
disk RCA turns is thus opposite the direction of track
path DrA, and direction of disk rotation RdB in which
disk RCB turns is thus opposite the direction of track
path DrB.

[0096] An exploded view of the single-sided, dual-layer
disk RC2 shown in Fig. 7 is shown as disk RC2o in
Fig. 11. Note that the recording tracks formed on the
two recording surfaces run in opposite directions. Specifically,
a clockwise recording track TRA as shown in
Fig. 9 is formed in clockwise direction DrA on the (lower)
first data recording surface RS1, and a counterclockwise
recording track TRB formed in counterclockwise
direction DrB as shown in Fig. 10 is provided on the
(upper) second data recording surface RS2. As a result,
the outside end points OA and OB of the first and second
(top and bottom) tracks are at the same radial position
relative to the center axis of the disk RC2o. Note
that track paths DrA and DrB of tracks TR are also the
data read/write directions to disk RC. The first and second
(top and bottom) recording tracks thus wind opposite
each other with this disk RC, i.e., the track paths
DrA and DrB of the top and bottom recording layers are
opposite track paths.
[0097] Opposite track path type, single-sided, dual- 
layer disks RC2o rotate in direction RdA corresponding 
to the first recording surface RS1 with the laser beam 
LS traveling along track path DrA to trace the recording 
track on the first recording surface RS1. When the laser
beam LS reaches the outside end point OA, the laser 
beam LS can be refocused to end point OB on the sec- 
ond recording surface RS2 to continue tracing the 
recording track from the first to the second recording 
surface uninterrupted. The physical distance between
the recording tracks TRA and TRB on the first and sec- 
ond recording surfaces RS1 and RS2 can thus be 
instantaneously eliminated by simply adjusting the 
focus of the laser beam LS. 

[0098] It is therefore possible with an opposite track
path type, single-sided, dual-layer disk RC2o to easily 
process the recording tracks disposed to physically dis- 
crete top and bottom recording surfaces as a single 
continuous recording track. It is therefore also possible 
in an authoring system as described above with reference
to Fig. 1 to continuously record the multimedia bitstream
MBS that is the largest multimedia data
management unit to two discrete recording surfaces
RS1 and RS2 on a single recording medium RC2o.
[0099] It should be noted that the tracks on recording
surfaces RS1 and RS2 can be wound in the directions
opposite those described above, i.e., the counterclockwise
track TRB may be provided on the first recording
surface RS1 and the clockwise track TRA on the second
recording surface RS2. In this case the direction of disk
rotation is also changed to a clockwise rotation RdB, 
thereby enabling the two recording surfaces to be used 
as comprising a single continuous recording track as 
described above. For simplification, a further example of 
this type of disk is therefore neither shown nor
described below. 

[0100] It is therefore possible by thus constructing the
digital video disk to record the multimedia bitstream 
MBS for a feature-length title to a single opposite track 
path type, single-sided, dual-layer disk RC2o. Note that
this type of digital video disk medium is called a single- 
sided dual-layer disk with opposite track paths. 
[0101] Another example of the single-sided, dual-layer
DVD recording medium RC2 shown in Fig. 7 is shown
as disk RC2p in Fig. 12. The recording tracks formed on
both first and second recording surfaces RS1 and RS2 
are clockwise tracks TRA as shown in Fig. 9. In this 
case, the single-sided, dual-layer disk RC2p rotates
counterclockwise in the direction of arrow RdA, and the
direction of laser beam LS travel is the same as the 
direction of the track spiral, i.e., the track paths of the 
top and bottom recording surfaces are mutually parallel 
(parallel track paths). The outside end points OA of both 
top and bottom tracks are again preferably positioned at 
the same radial position relative to the center axis of the 
disk RC2p as described above. As also described 
above with disk RC2o shown in Fig. 11, the access point
can be instantaneously shifted from outside end point 
OA of track TRA on the first recording surface RS1 to 
the outside end point OA of track TRA on the second 
recording surface RS2 by appropriately adjusting the 
focus of the laser beam LS at outside end point OA. 
[0102] However, for the laser beam LS to continuously
access the clockwise recording track TRA on the second
recording surface RS2, the recording medium
RC2p must be driven in the opposite direction (clock- 
wise, opposite direction RdA). Depending on the radial 
position of the laser beam LS, however, it is inefficient to 
change the rotational direction of the recording medium. 
As shown by the diagonal arrow in Fig. 12, the laser 
beam LS is therefore moved from the outside end point 
OA of the track on the first recording surface RS1 to the 
inside end point IA of the track on the second recording 
surface RS2 to use these physically discrete recording 
tracks as one logically continuous recording track.
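
The difference between the two layer-change strategies can be summarized in a small sketch. The function name, the string labels, and the returned tuple are illustrative assumptions; the sketch models only where the beam leaves the first recording surface and where it resumes on the second, for tracks that start at the inside circumference of the first surface.

```python
def layer_change(track_path):
    """Describe the jump from the first to the second recording surface.

    Returns (end position on surface 1, start position on surface 2,
    whether a radial move of the beam is needed).
    """
    if track_path == "opposite":
        # Surface 1 runs inside to outside, surface 2 outside to inside:
        # only the focus changes at the shared outside end point.
        return ("outside", "outside", False)
    if track_path == "parallel":
        # Both surfaces run inside to outside: the beam must return to the
        # inside end point of surface 2 before reading continues.
        return ("outside", "inside", True)
    raise ValueError(f"unknown track path type: {track_path!r}")
```

The extra radial move in the parallel case is why the opposite track path arrangement is described as better suited to continuous reproduction across both surfaces.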
[0103] Rather than using the recording tracks on top 
and bottom recording surfaces as one continuous 
recording track, it is also possible to use the recording 
tracks to record the multimedia bitstreams MBS for dif- 
ferent titles. This type of digital video disk recording 
medium is called a "single-sided, dual-layer disk with 
parallel track paths." 

[0104] Note that if the direction of the tracks formed on
the recording surfaces RS1 and RS2 is opposite that 
described above, i.e., counterclockwise recording tracks 
TRB are formed, disk operation remains the same as 
that described above except for the direction of disk 
rotation, which is clockwise as shown by arrow RdB. 
[0105] Whether using clockwise or counterclockwise 
recording tracks, the single-sided, dual-layer disk RC2p 
with parallel track paths thus described is well-suited to 
storing on a single disk encyclopedia and similar multimedia
bitstreams comprising multiple titles that are frequently
and randomly accessed.
[0106] An exploded view of the dual-sided single-layer
DVD recording medium RC3 comprising one recording
surface layer RS1 and RS2 on each side as shown in
Fig. 8 is shown as DVD recording medium RC3s in Fig. 
13. Clockwise recording track TRA is provided on the 
one recording surface RS1, and a counterclockwise 
recording track TRB is provided on the other recording 
surface RS2. As in the preceding recording media, the 
outside end points OA and OB of the recording tracks 
on each recording surface are preferably positioned at 
the same radial position relative to the center axis of the 
DVD recording medium RC3s. 
[0107] Note that while the recording tracks on these
recording surfaces RS1 and RS2 rotate in opposite 
directions, the track paths are symmetrical. This type of 
recording medium is therefore known as a double-sided 
dual layer disk with symmetrical track paths. This double-sided
dual layer disk with symmetrical track paths
RC3s rotates in direction RdA when reading/writing the 
first recording surface RS1. As a result, the track path 
on the second recording surface RS2 on the opposite 
side is opposite the direction DrB in which the track
winds, i.e., direction DrA. Accessing both recording sur- 
faces RS1 and RS2 using a single laser beam LS is 
therefore not realistic irrespective of whether access is 
continuous or non-continuous. In addition, a multimedia 
bitstream MBS is separately recorded to the recording
surfaces on the first and second sides of the disk. 
[0108] A different example of the double-sided single 
layer disk RC3 shown in Fig. 8 is shown in Fig. 14 as 
disk RC3a. Note that this disk comprises clockwise 
recording tracks TRA as shown in Fig. 9 on both recording
surfaces RS1 and RS2. As with the preceding
recording media, the outside end points OA and OA of 
the recording tracks on each recording surface are pref- 
erably positioned at the same radial position relative to 
the center axis of the DVD recording medium RC3a.
Unlike the double-sided dual layer disk with symmetrical 
track paths RC3s described above, the tracks on these 
recording surfaces RS1 and RS2 are asymmetrical. 
This type of disk is therefore known as a double-sided 
dual layer disk with asymmetrical track paths. This double-sided
dual layer disk with asymmetrical track paths
RC3a rotates in direction RdA when reading/writing the 
first recording surface RS1. As a result, the track path 
on the second recording surface RS2 on the opposite 
side is opposite the direction DrA in which the track
winds, i.e., direction DrB. 

[0109] This means that if a laser beam LS is driven 
continuously from the inside circumference to the outside
circumference on the first recording surface RS1,
and then from the outside circumference to the inside
circumference on the second recording surface RS2,
both sides of the recording medium RC3a can be 
read/written without turning the disk over and without 
providing different laser beams for the two sides. 
[0110] The track paths for recording surfaces RS1 and
RS2 are also the same with this double-sided dual layer 
disk with asymmetrical track paths RC3a. As a result, it 
is also possible to read/write both sides of the disk with- 
out providing separate laser beams for each side if the 
recording medium RC3a is turned over between sides,
and the read/write apparatus can therefore be con- 
structed economically. 

[0111] It should be noted that this recording medium 
remains functionally identical even if counterclockwise 
recording track TRB is provided in place of clockwise
recording track TRA on both recording surfaces RS1 
and RS2. 

[0112] As described above, the true value of a DVD
system whereby the storage capacity of the recording
medium can be easily increased by using a multiple 
layer recording surface is realized in multimedia applica- 
tions whereby plural video data units, plural audio data 
units, and plural graphics data units recorded to a single 
disk are reproduced through interactive operation by the 
user. 

[0113] It is therefore possible to achieve one long- 
standing desire of software (programming) providers, 
specifically, to provide programming content such as a 
commercial movie on a single recording medium in plu- 
ral versions for different language and demographic 
groups while retaining the image quality of the original. 

Parental control 

[0114] Content providers of movie and video titles 
have conventionally had to produce, supply, and man- 
age the inventory of individual titles in multiple lan- 
guages, typically the language of each distribution 
market, and multi-rated title packages conforming to the 
parental control (censorship) regulations of individual 
countries in Europe and North America. The time and 
resources required for this are significant. While high 
image quality is obviously important, the programming 
content must also be consistently reproducible. 
[0115] The digital video disk recording medium is 
close to solving these problems. 

Multiple angles 

[0116] One interactive operation widely sought in
multimedia applications today is for the user to be able
to change the position from which a scene is viewed
during reproduction of that scene. This capability is
achieved by means of the multiple angle function.
[0117] This multiple angle function makes possible 
applications whereby, for example, a user can watch a 
baseball game from different angles (or virtual positions 
in the stadium), and can freely switch between the views 
while viewing is in progress. In this example of a base- 
ball game, the available angles may include a position 
behind the backstop centered on the catcher, batter, 
and pitcher; one from behind the backstop centered on 
a fielder, the pitcher, and the catcher; and one from 
center field showing the view to the pitcher and catcher. 
[0118] To meet these requirements, the digital video
disk system uses MPEG, the same basic standard format
used with Video-CDs, to record the video, audio,
graphics, and other signal data. Because of the
differences in storage capacity, transfer rates, and
signal processing performance within the reproduction
apparatus, DVD uses MPEG2, the compression method and
data format of which differ slightly from the MPEG1
format used with Video-CDs.

[0119] It should be noted that the content of and
differences between the MPEG1 and MPEG2 standards
have no direct relationship to the intent of the present
invention, and further description is therefore omitted
below (for more information, see MPEG specifications
ISO-11172 and ISO-13818).

[0120] The data structure of the DVD system according
to the present invention is described in detail below
with reference to Figs. 16, 17, 18, 19, 20, and 21. 

Multi-scene control 

[0121] A fully functional and practical parental lock
playback function and multi-angle scene playback
function must enable the user to modify the system output
in minor, subtle ways while still presenting substantially
the same video and audio output. If these functions are
achieved by preparing and recording separate titles
satisfying each of the many possible parental lock and
multi-angle scene playback requests, titles that are
substantially identical and differ in only minor ways must
be recorded to the recording medium. This results in
identical data being repeatedly recorded to the larger
part of the recording medium, and significantly reduces
the utilization efficiency of the available storage
capacity. More particularly, it is virtually impossible to
record discrete titles satisfying every possible request
even using the massive capacity of the digital video disk
medium. While it may be concluded that this problem can
be easily solved by increasing the capacity of the
recording medium, this is an obviously undesirable
solution when the effective use of available system
resources is considered.
[0122] Using multi-scene control, the concept of which
is described in another section below, in a DVD system,
it is possible to dynamically construct titles for numerous
variations of the same basic content using the smallest
possible amount of data, and thereby effectively utilize
the available system resources (recording medium).
More specifically, titles that can be played back with
numerous variations are constructed from basic (common)
scene periods containing data common to each title, and
multi-scene periods comprising groups of different
scenes corresponding to the various requests. During
reproduction, the user is able to freely and at any time
select particular scenes from the multi-scene periods to
dynamically construct a title conforming to the desired
content, e.g., a title omitting certain scenes using the
parental lock control function.
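The way a title is assembled from common scene periods and multi-scene periods can be illustrated with a short sketch. This is only a simplified model in Python under stated assumptions: the class names, scene names, and selection keys are hypothetical, and a single selection key is applied to the whole title, whereas the text allows the user to select at any time during reproduction.

```python
from dataclasses import dataclass
from typing import Dict, List, Union


@dataclass
class CommonPeriod:
    scene: str  # a scene shared by every variation of the title


@dataclass
class MultiScenePeriod:
    scenes: Dict[str, str]  # selection key (e.g. a rating or an angle) -> scene


Period = Union[CommonPeriod, MultiScenePeriod]


def build_title(periods: List[Period], choice: str) -> List[str]:
    """Return the scene sequence for one variation of the title."""
    path = []
    for period in periods:
        if isinstance(period, CommonPeriod):
            path.append(period.scene)           # common data is stored only once
        else:
            path.append(period.scenes[choice])  # pick one scene from the group
    return path


# Hypothetical scene names and selection keys, for illustration only.
title = [
    CommonPeriod("scene1"),
    MultiScenePeriod({"adult": "scene2_full", "child": "scene2_cut"}),
    CommonPeriod("scene3"),
]
print(build_title(title, "child"))  # ['scene1', 'scene2_cut', 'scene3']
```

Only the scenes inside a multi-scene period are stored more than once, which is the source of the storage saving described above.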
[0123] Note that multi-scene control enabling a parental
lock playback control function and multi-angle scene
playback is described in another section below with
reference to Fig. 21.

Data structure of the DVD system 

[0124] The data structure used in the authoring system
of a digital video disk system according to the present
invention is shown in Fig. 22. To record a multimedia
bitstream MBS, this digital video disk system divides the
recording medium into three major recording areas, the
lead-in area LI, the volume space VS, and the lead-out
area LO.

[0125] The lead-in area LI is provided at the inside
circumference area of the optical disk. In the disks
described with reference to Figs. 9 and 10, the lead-in 
area LI is positioned at the inside end points IA and IB 
of each track. Data for stabilizing the operation of the 
reproducing apparatus when reading starts is written to 
the lead-in area LI. 

[0126] The lead-out area LO is correspondingly 
located at the outside circumference of the optical disk, 
i.e., at outside end points OA and OB of each track in 
the disks described with reference to Figs. 9 and 10. 
Data identifying the end of the volume space VS is 
recorded in this lead-out area LO. 
[0127] The volume space VS is located between the 
lead-in area LI and lead-out area LO, and is recorded as 
a one-dimensional array of n+1 (where n is an integer 
greater than or equal to zero) 2048-byte logic sectors 
LS. The logic sectors LS are sequentially numbered #0,
#1, #2, ... #n. The volume space VS is also divided into 
a volume and file structure management area VFS and 
a file data structure area FDS. 

[0128] The volume and file structure management
area VFS comprises m+1 logic sectors LS#0 to LS#m
(where m is an integer greater than or equal to zero and
less than n). The file data structure area FDS comprises
n-m logic sectors LS #m+1 to LS #n.
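The sector arithmetic in the two preceding paragraphs can be sketched as follows. This is a minimal illustration in Python and is not part of the disclosed apparatus; the function names and the example values of n and m are assumptions chosen only for the demonstration.

```python
SECTOR_SIZE = 2048  # bytes per logic sector LS, as stated in the text


def sector_byte_offset(ls_number: int) -> int:
    """Byte offset of logic sector LS#ls_number from the start of the volume space VS."""
    if ls_number < 0:
        raise ValueError("logic sector numbers start at 0")
    return ls_number * SECTOR_SIZE


def split_volume_space(n: int, m: int) -> dict:
    """Partition LS#0..LS#n into the VFS (LS#0..LS#m) and the FDS (LS#m+1..LS#n)."""
    if not (0 <= m < n):
        raise ValueError("requires 0 <= m < n")
    return {
        "VFS": range(0, m + 1),      # m+1 sectors
        "FDS": range(m + 1, n + 1),  # n-m sectors
    }


# Hypothetical example values: n = 999 (1000 sectors in total), m = 15.
layout = split_volume_space(n=999, m=15)
print(len(layout["VFS"]), len(layout["FDS"]))  # 16 984
print(sector_byte_offset(layout["FDS"][0]))    # 32768
```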
[0129] Note that this file data structure area FDS
corresponds to the multimedia bitstream MBS shown in
Fig. 1 and described above. 

[0130] The volume file structure VFS is the file system
for managing the data stored to the volume space VS as 
files, and is divided into logic sectors LS#0 - LS#m 
where m is the number of sectors required to store all 
data needed to manage the entire disk, and is a natural 
number less than n. Information for the files stored to 
the file data structure area FDS is written to the volume 
file structure VFS according to a known specification 
such as ISO-9660 or ISO-13346. 
[0131] The file data structure area FDS comprises n-m
logic sectors LS#m+1 - LS#n, and contains a video
manager VMG sized to an integer multiple of the logic
sector (2048 x I, where I is a known integer), and k video
title sets VTS #1 - VTS #k (where k is a natural number
less than 100).

[0132] The video manager VMG stores the title man- 
agement information for the entire disk, and information 
for building a volume menu used to set and change 
reproduction control of the entire volume. 
[0133] Any video title set VTS #k is also called a 
"video file" representing a title comprising video, audio, 
and/or still image data. 

[0134] The internal structure of each video title set 
VTS shown in Fig. 22 is shown in Fig. 16. Each video 
title set VTS comprises VTS information VTSI describ- 
ing the management information for the entire disk, and 
the VTS title video objects VOB (VTSTT_VOBS), i.e.,
the system stream of the multimedia bitstream. The
VTS information VTSI is described first below, followed 
by the VTS title VOBS. 

[0135] The VTS information primarily includes the
VTSI management table VTSI_MAT and VTSPGC
information table VTS_PGCIT.

[0136] The VTSI management table VTSI_MAT stores
such information as the internal structure of the video
title set VTS, the number of selectable audio streams
contained in the video title set VTS, the number of
sub-pictures, and the video title set VTS location
(storage address).

[0137] The VTSPGC information table VTS_PGCIT
records i (where i is a natural number) program chain
(PGC) data blocks VTS_PGCI #1 - VTS_PGCI #i for
controlling the playback sequence. Each of the table
entries VTS_PGCI #i is a data entry expressing the
program chain, and comprises j (where j is a natural
number) cell playback information blocks C_PBI #1 -
C_PBI #j. Each cell playback information block C_PBI #j
contains the playback sequence of the cell and
playback control information.

[0138] The program chain PGC is a conceptual
structure describing the story of the title content, and
therefore defines the structure of each title by
describing the cell playback sequence. Note that these
cells are described in detail below.

[0139] If, for example, the video title set information
relates to the menus, the video title set information VTSI
is stored to a buffer in the playback device when
playback starts. If the user then presses a MENU button
on a remote control device, for example, during playback,
the playback device references the buffer to fetch the
menu information and display the top menu #1. If the
menus are hierarchical, the main menu stored as
program chain information VTS_PGCI #1 may be
displayed, for example, by pressing the MENU button,
VTS_PGCI #2 - #9 may correspond to submenus
accessed using the numeric keypad on the remote
control, and VTS_PGCI #10 and higher may correspond
to additional submenus further down the hierarchy.
Alternatively, VTS_PGCI #1 may be the top menu
displayed by pressing the MENU button, while
VTS_PGCI #2 and higher may be voice guidance
reproduced by pressing the corresponding numeric key.
[0140] The menus themselves are expressed by the
plural program chains defined in this table. As a result,
the menus may be freely constructed in various ways,
and shall not be limited to hierarchical or
non-hierarchical menus or menus containing voice
guidance.
[0141] In the case of a movie, for example, the video
title set information VTSI is stored to a buffer in the
playback device when playback starts, the playback
device references the cell playback sequence described
by the program chain PGC, and reproduces the system
stream.

[0142] The "cells" referenced here may be all or part
of the system stream, and are used as access points
during playback. Cells can therefore be used, for
example, as the "chapters" into which a title may be
divided.
[0143] Note that each of the cell playback information
entries C_PBI #j contains both cell playback processing
information and a cell information table. The cell
playback processing information comprises the
processing information needed to reproduce the cell,
such as the presentation time and number of repetitions.
More specifically, this information includes the cell block
mode CBM, cell block type CBT, seamless playback flag
SPF, interleaved allocation flag IAF, STC resetting flag
STCDF, cell presentation time C_PBTM, seamless
angle change flag SACF, first cell VOBU start address
C_FVOBU_SA, and the last cell VOBU start address
C_LVOBU_SA.

[0144] Note that seamless playback refers to the 
reproduction in a digital video disk system of multimedia 
data including video, audio, and sub-picture data with- 
out intermittent breaks in the data or information. Seam- 
less playback is described in detail in another section 
below with reference to Fig. 23 and Fig. 24. 
[0145] The cell block mode CBM indicates whether
plural cells constitute one functional block. The cell
playback information of each cell in a functional block is
arranged consecutively in the PGC information. The cell
block mode CBM of the first cell playback information in
this sequence contains a value identifying the first cell
in the block, and the cell block mode CBM of the last cell
playback information in this sequence contains a value
identifying the last cell in the block. The cell block mode
CBM of each cell arrayed between these first and last
cells contains a value indicating that the cell lies
between the first and last cells in that block.
[0146] The cell block type CBT identifies the type of 
the block indicated by the cell block mode CBM. For 
example, when a multiple angle function is enabled, the 
cell information corresponding to each of the reproduci- 
ble angles is programmed as one of the functional 
blocks mentioned above, and the type of these func- 
tional blocks is defined by a value identifying "angle" in 
the cell block type CBT for each cell in that block. 
[0147] The seamless playback flag SPF simply
indicates whether the corresponding cell is to be linked
and played back seamlessly with the cell or cell block
reproduced immediately before it. To seamlessly
reproduce a given cell with the preceding cell or cell
block, the seamless playback flag SPF is set to 1 in the
cell playback information for that cell; otherwise SPF is
set to 0.

[0148] The interleaved allocation flag IAF stores a 
value identifying whether the cell exists in a contiguous 
or interleaved block. If the cell is part of an interleaved 
block, the flag IAF is set to 1; otherwise it is set to 0.
[0149] The STC resetting flag STCDF identifies 
whether the system time clock STC used for synchroni- 
zation must be reset when the cell is played back; when 
resetting the system time clock STC is necessary, the 
STC resetting flag STCDF is set to 1.
[0150] The seamless angle change flag SACF stores
a value indicating whether a cell in a multi-angle period
should be connected seamlessly at an angle change. If
the angle change is seamless, the seamless angle
change flag SACF is set to 1; otherwise it is set to 0.
[0151] The cell presentation time C_PBTM expresses
the cell presentation time with video frame precision. 
[0152] The first cell VOBU start address
C_FVOBU_SA is the VOBU start address of the first cell
in a block, and is also expressed as the distance from
the logic sector of the first cell in the VTS title VOBS
(VTSTT_VOBS) as measured by the number of sectors.
[0153] The last cell VOBU start address
C_LVOBU_SA is the VOBU start address of the last cell
in the block. The value of this address is expressed as
the distance from the logic sector of the first cell in the
VTS title VOBS (VTSTT_VOBS) as measured by the
number of sectors.
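The cell playback information fields described above can be gathered into one record, and the cell block mode CBM can then be used to group consecutive cells into functional blocks. The following Python sketch is illustrative only: the text does not give the numeric encoding of CBM or CBT, so symbolic values are used, and the NOT_IN_BLOCK value and all names are assumptions.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List


class CellBlockMode(Enum):
    # Symbolic values only; the actual encoding is not given in the text.
    NOT_IN_BLOCK = "not in a block"   # assumed value for a stand-alone cell
    FIRST = "first cell in block"
    MIDDLE = "cell between first and last"
    LAST = "last cell in block"


@dataclass
class CellPlaybackInfo:
    cbm: CellBlockMode   # cell block mode CBM
    cbt: str             # cell block type CBT, e.g. "angle"
    spf: bool            # seamless playback flag SPF
    iaf: bool            # interleaved allocation flag IAF
    stcdf: bool          # STC resetting flag STCDF
    sacf: bool           # seamless angle change flag SACF
    c_pbtm_frames: int   # cell presentation time C_PBTM, in video frames
    c_fvobu_sa: int      # first cell VOBU start address C_FVOBU_SA, in sectors
    c_lvobu_sa: int      # last cell VOBU start address C_LVOBU_SA, in sectors


def group_blocks(cells: List[CellPlaybackInfo]) -> List[List[CellPlaybackInfo]]:
    """Group consecutively arranged cells into functional blocks using CBM."""
    blocks, current = [], []
    for cell in cells:
        if cell.cbm is CellBlockMode.NOT_IN_BLOCK:
            blocks.append([cell])
        else:
            current.append(cell)
            if cell.cbm is CellBlockMode.LAST:
                blocks.append(current)
                current = []
    return blocks


def make_cell(cbm: CellBlockMode) -> CellPlaybackInfo:
    """Build a cell with placeholder values; only CBM matters for grouping."""
    return CellPlaybackInfo(cbm=cbm, cbt="angle", spf=False, iaf=False,
                            stcdf=False, sacf=False, c_pbtm_frames=0,
                            c_fvobu_sa=0, c_lvobu_sa=0)


cells = [make_cell(CellBlockMode.NOT_IN_BLOCK), make_cell(CellBlockMode.FIRST),
         make_cell(CellBlockMode.MIDDLE), make_cell(CellBlockMode.LAST),
         make_cell(CellBlockMode.NOT_IN_BLOCK)]
print([len(block) for block in group_blocks(cells)])  # [1, 3, 1]
```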

[0154] The VTS title VOBS (VTSTT_VOBS), i.e., the
multimedia system stream data, is described next. The
system stream data VTSTT_VOBS comprises i (where i
is a natural number) system streams SS, each of which
is referred to as a "video object" (VOB). Each video
object VOB #1 - VOB #i comprises at least one video
data block interleaved with up to eight audio data blocks
and up to 32 sub-picture data blocks.

[0155] Each video object VOB comprises q (where q 
is a natural number) cells C#1 - C#q. Each cell C com- 
prises r (where r is a natural number) video object units 
VOBU #1 - VOBU #r. 

[0156] Each video object unit VOBU comprises plural
groups_of_pictures GOP, and the audio and
sub-pictures corresponding to the playback of said plural
groups_of_pictures GOP. Note that the
group_of_pictures GOP corresponds to the video
encoding refresh cycle. Each video object unit VOBU
also starts with an NV pack, i.e., the control data for that
VOBU.

[0157] The structure of the navigation packs NV is 
described with reference to Fig. 18. 
[0158] Before describing the navigation pack NV, the
internal structure of the video zone VZ (see Fig. 22),
i.e., the system stream St35 encoded by the authoring
encoder EC described with reference to Fig. 25, is
described with reference to Fig. 17. Note that the
encoded video stream St15 shown in Fig. 17 is the
compressed one-dimensional video data stream
encoded by the video encoder 300. The encoded audio
stream St19 is likewise the compressed one-dimensional
audio data stream multiplexing the right and left stereo
audio channels encoded by the audio encoder 700. Note
that the audio signal shall not be limited to a stereo
signal, and may also be a multichannel surround-sound
signal.
[0159] The system stream (title editing unit VOB) St35
is a one-dimensional array of packs with a byte size
corresponding to the logic sectors LS #n having a
2048-byte capacity as described using Fig. 21. A stream
control pack is placed at the beginning of the title editing
unit (VOB) St35, i.e., at the beginning of the video object
unit VOBU. This stream control pack is called the
"navigation pack NV", and records the data arrangement
in the system stream and other control information.

[0160] The encoded video stream St15 and the
encoded audio stream St19 are packetized in byte units
corresponding to the system stream packs. These
packets are shown in Fig. 17 as packets V1, V2, V3, V4...
and A1, A2, A3.... As shown in Fig. 17, these packets
are interleaved in the appropriate sequence as system
stream St35, thus forming a packet stream, with
consideration given to the decoder buffer size and the
time required by the decoder to expand the video and
audio data packets. In the example shown in Fig. 17, the
packet stream is interleaved in the sequence V1, V2,
A1, V3, V4, A2....
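One simple way to picture this interleaving is as a merge of per-stream packet queues ordered by the time at which each packet must reach the decoder. The Python sketch below is a simplified illustration and not the encoder's actual multiplexing rule: the deadline values are hypothetical, chosen so that the result matches the sequence of Fig. 17, and no buffer model is applied.

```python
import heapq

# Each entry is (hypothetical delivery deadline, packet label).
# Each stream is already sorted by deadline.
video = [(0, "V1"), (1, "V2"), (3, "V3"), (4, "V4")]
audio = [(2, "A1"), (5, "A2")]


def interleave(*streams):
    """Merge sorted packet streams into a single sequence ordered by deadline."""
    return [label for _, label in heapq.merge(*streams)]


print(interleave(video, audio))  # ['V1', 'V2', 'A1', 'V3', 'V4', 'A2']
```

The same function accepts more than two streams, which corresponds to the multi-audio and sub-picture case described next.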

[0161] Note that the sequence shown in Fig. 17
interleaves one video data unit with one audio data unit.
Significantly increased recording/playback capacity, high
speed recording/playback, and performance
improvements in the signal processing LSI enable the
DVD system to record plural audio data and plural
sub-picture data (graphics data) to one video data unit in
a single interleaved MPEG system stream, and thereby
enable the user to select the specific audio data and
sub-picture data to be reproduced during playback. The
structure of the system stream used in this type of DVD
system is shown in Fig. 18 and described below.

[0162] As in Fig. 17, the packetized encoded video
stream St15 is shown in Fig. 18 as V1, V2, V3, V4, ... In
this example, however, there is not just one encoded
audio stream St19, but three encoded audio streams
St19A, St19B, and St19C input as the source data.
There are also two encoded sub-picture streams St17A
and St17B input as the source data sub-picture
streams. These six compressed data streams, St15,
St19A, St19B, St19C, St17A and St17B, are interleaved
to a single system stream St35.

[0163] The video data is encoded according to the
MPEG specification with the group_of_pictures GOP
being the unit of compression. In general, each
group_of_pictures GOP contains 15 frames in the case
of an NTSC signal, but the specific number of frames
compressed to one GOP is variable. The stream
management pack, which describes the management
data containing, for example, the relationship between
interleaved data, is also interleaved at the GOP unit
interval. Because the group_of_pictures GOP unit is
based on the video data, changing the number of video
frames per GOP unit changes the interval of the stream
management packs. This interval is expressed in terms of
the presentation time on the digital video disk within a
range from 0.4 sec. to 1.0 sec. referenced to the GOP
unit. If the presentation time of contiguous plural GOP
units is less than 1 sec., the management data packs for
the video data of the plural GOP units are interleaved to a
single stream.






[0164] These management data packs are referred to
as navigation packs NV in the digital video disk system. 
The data from one navigation pack NV to the packet 
immediately preceding the next navigation pack NV 
forms one video object unit VOBU. In general, one con- 
tiguous playback unit that can be defined as one scene 
is called a video object VOB, and each video object 
VOB contains plural video object units VOBU. Data sets 
of plural video objects VOB form a VOB set (VOBS). 
Note that these data units were first used in the digital 
video disk. 

[0165] When plural of these data streams are inter- 
leaved, the navigation packs NV defining the relation- 
ship between the interleaved packs must also be 
interleaved at a defined unit known as the pack number 
unit. Each group_of_pictures GOP is normally a unit 
containing approximately 0.5 sec. of video data, which 
is equivalent to the presentation time required for 12 - 
15 frames, and one navigation pack NV is generally 
interleaved with the number of data packets required for 
this presentation time. 
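The relationship between the GOP frame count and the navigation pack interval can be checked with a little arithmetic. The following Python sketch assumes the exact NTSC frame rate of 30000/1001 frames per second, which the text does not state; the 0.4 sec. to 1.0 sec. range is the one given above, and the function names are hypothetical.

```python
from fractions import Fraction

NTSC_FPS = Fraction(30000, 1001)  # assumption: exact NTSC rate (about 29.97 fps)
NV_INTERVAL_RANGE = (Fraction(4, 10), Fraction(1))  # 0.4 s to 1.0 s, from the text


def gop_presentation_time(frames: int, fps: Fraction = NTSC_FPS) -> Fraction:
    """Presentation time in seconds of a GOP containing the given number of frames."""
    return Fraction(frames) / fps


def within_nv_interval(seconds: Fraction) -> bool:
    """True if the duration falls inside the permitted navigation pack interval."""
    low, high = NV_INTERVAL_RANGE
    return low <= seconds <= high


for frames in (12, 15):
    t = gop_presentation_time(frames)
    print(frames, float(t), within_nv_interval(t))
```

Under this assumption a 15-frame GOP lasts about 0.5 sec. and a 12-frame GOP about 0.4 sec., so both fall inside the stated range.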

[0166] The stream management information con- 
tained in the interleaved video, audio, and sub-picture 
data packets constituting the system stream is 
described below with reference to Fig. 19. As shown in
Fig. 19, the data contained in the system stream is 
recorded in a format packed or packetized according to 
the MPEG2 standard. The packet structure is essen- 
tially the same for video, audio, and sub-picture data. 
One pack in the digital video disk system has a 2048 
byte capacity as described above, and contains a pack 
header PKH and one packet PES; each packet PES 
contains a packet header PTH and data block. 
[0167] The pack header PKH records the time at 
which that pack is to be sent from stream buffer 2400 to 
system decoder 2500 (see Fig. 26), i.e., the system 
clock reference SCR defining the reference time for syn- 
chronized audio-visual data playback. The MPEG 
standard assumes that the system clock reference SCR 
is the reference clock for the entire decoder operation. 
With such disk media as the digital video disk, however, 
time management specific to individual disk players can 
be used, and a reference clock for the decoder system 
is therefore separately provided. 
[0168] The packet header PTH similarly contains a 
presentation time stamp PTS and a decoding time 
stamp DTS, both of which are placed in the packet
before the access unit (the decoding unit). The presen- 
tation time stamp PTS defines the time at which the 
video data or audio data contained in the packet should 
be output as the playback output after being decoded, 
and the decoding time stamp DTS defines the time at 
which the video stream should be decoded. Note that 
the presentation time stamp PTS effectively defines the 
display start timing of the access unit, and the decoding 
time stamp DTS effectively defines the decoding start 
timing of the access unit. If the PTS and DTS are the 
same time, the DTS is omitted. 



[0169] The packet header PTH also contains an 8-bit
field called the stream ID identifying the packet type,
i.e., whether the packet is a video packet containing a
video data stream, a private packet, or an MPEG audio
packet.
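A decoder can dispatch on the stream ID with a simple range test. The Python sketch below uses the stream_id assignments of the MPEG-2 Systems standard (ISO/IEC 13818-1); those numeric values are not given in this text, and the function name and the returned labels are chosen here only for illustration.

```python
def classify_stream_id(stream_id: int) -> str:
    """Classify an 8-bit stream ID using MPEG-2 Systems stream_id assignments."""
    if not 0 <= stream_id <= 0xFF:
        raise ValueError("stream ID is an 8-bit field")
    if stream_id == 0xBD:
        return "private packet 1"   # private_stream_1
    if stream_id == 0xBF:
        return "private packet 2"   # private_stream_2
    if 0xC0 <= stream_id <= 0xDF:
        return "MPEG audio packet"  # bit pattern 110x xxxx
    if 0xE0 <= stream_id <= 0xEF:
        return "video packet"       # bit pattern 1110 xxxx
    return "other"


print(classify_stream_id(0xE0))  # video packet
print(classify_stream_id(0xBD))  # private packet 1
```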

[0170] Private packets under the MPEG2 standard are
data packets of which the content can be freely defined.
Private packet 1 in this embodiment of the invention is
used to carry audio data other than the MPEG audio
data, and sub-picture data; private packet 2 carries the
PCI packet and DSI packet.

[0171] Private packets 1 and 2 each comprise a
packet header, private data area, and data area. The
private data area contains an 8-bit sub-stream ID
indicating whether the recorded data is audio data or
sub-picture data. The audio data defined by private
packet 1 may be defined as any of eight types #0 - #7 of
linear PCM or AC-3 encoded data. Sub-picture data may
be defined as one of up to 32 types #0 - #31.

[0172] The data area is the field to which data
compressed according to the MPEG2 specification is
written if the stored data is video data; linear PCM, AC-3,
or MPEG encoded data is written if audio data is stored;
or graphics data compressed by runlength coding is
written if sub-picture data is stored.

[0173] MPEG2-compressed video data may be
compressed by constant bit rate (CBR) or variable bit
rate (VBR) coding. With constant bit rate coding, the
video stream is input continuously to the video buffer at
a constant rate. This contrasts with variable bit rate
coding in which the video stream is input intermittently to
the video buffer, thereby making it possible to suppress
the generation of unnecessary code. Both constant bit
rate and variable bit rate coding can be used in the
digital video disk system.

[0174] Because MPEG video data is compressed with
variable length coding, the data quantity in each
group_of_pictures GOP is not constant. The video and
audio decoding times also differ, and the time-base
relationship between the video and audio data read from
an optical disk, and the time-base relationship between
the video and audio data output from the decoder, do not
match. The method of time-base synchronizing the
video and audio data is therefore described in detail
below with reference to Fig. 26, but is described briefly
below based on constant bit rate coding.
[0175] The navigation pack NV structure is shown in
Fig. 20. Each navigation pack NV starts with a pack
header PKH, and contains a PCI packet and DSI
packet.

[0176] As described above, the pack header PKH
records the time at which that pack is to be sent from
stream buffer 2400 to system decoder 2500 (see
Fig. 26), i.e., the system clock reference SCR defining
the reference time for synchronized audio-visual data
playback.

[0177] Each PCI packet contains PCI General
Information (PCI_GI) and Angle Information for
Non-seamless playback (NMSL_AGLI).

[0178] The PCI General Information (PCI_GI)
declares the display time of the first video frame (the
Start PTM of VOBU (VOBU_S_PTM)), and the display
time of the last video frame (End PTM of VOBU
(VOBU_E_PTM)), in the corresponding video object
unit VOBU with system clock precision (90 kHz).
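Because these times are counted in ticks of a 90 kHz clock, converting them to seconds is a single division. The Python sketch below is illustrative: the function names and the example tick values are assumptions, and the difference between the two fields is used here only as a rough measure of the span they cover.

```python
SYSTEM_CLOCK_HZ = 90_000  # system clock precision stated in the text


def ptm_to_seconds(ptm_ticks: int) -> float:
    """Convert a presentation time in 90 kHz ticks to seconds."""
    return ptm_ticks / SYSTEM_CLOCK_HZ


def vobu_span_seconds(vobu_s_ptm: int, vobu_e_ptm: int) -> float:
    """Time between VOBU_S_PTM and VOBU_E_PTM, in seconds."""
    if vobu_e_ptm < vobu_s_ptm:
        raise ValueError("end PTM precedes start PTM")
    return (vobu_e_ptm - vobu_s_ptm) / SYSTEM_CLOCK_HZ


# Hypothetical values: a VOBU starting 10 s into the stream and spanning 45045 ticks.
print(ptm_to_seconds(900_000))              # 10.0
print(vobu_span_seconds(900_000, 945_045))  # 0.5005
```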
[0179] The Angle Information for Non-seamless
playback (NMSL_AGLI) states the read start address of
the corresponding video object unit VOBU when the
angle is changed, expressed as the number of sectors
from the beginning of the video object VOB. Because
there are nine or fewer angles in this example, there are
nine angle address declaration cells: Destination
Address of Angle Cell #1 for Non-seamless playback
(NMSL_AGL_C1_DSTA) to Destination Address of
Angle Cell #9 for Non-seamless playback
(NMSL_AGL_C9_DSTA).

[0180] Each DSI packet contains DSI General
Information (DSI_GI), Seamless Playback Information
(SML_PBI), and Angle Information for Seamless
playback (SML_AGLI).

[0181] The DSI General Information (DSI_GI)
declares the address of the last pack in the video object
unit VOBU, i.e., the End Address for VOBU (VOBU_EA),
expressed as the number of sectors from the beginning
of the video object unit VOBU.

[0182] While seamless playback is described in detail
later, it should be noted that the continuously read data
units must be interleaved (multiplexed) at the system
stream level as an interleaved unit ILVU in order to
seamlessly reproduce split or combined titles. Plural
system streams interleaved with the interleaved unit
ILVU as the smallest unit are defined as an interleaved
block.
[0183] The Seamless Playback Information
(SML_PBI) is declared to seamlessly reproduce the
stream interleaved with the interleaved unit ILVU as the
smallest data unit, and contains an Interleaved Unit
Flag (ILVU flag) identifying whether the corresponding
video object unit VOBU is in an interleaved block. The
ILVU flag is set to 1 when the video object unit VOBU is
in an interleaved block. Otherwise the ILVU flag is set
to 0.

[0184] When a video object unit VOBU is in an
interleaved block, a Unit END flag is declared to indicate
whether the video object unit VOBU is the last VOBU in
the interleaved unit ILVU. Because the interleaved unit
ILVU is the data unit for continuous reading, the Unit
END flag is set to 1 if the VOBU currently being read is
the last VOBU in the interleaved unit ILVU. Otherwise
the Unit END flag is set to 0.

[0185] An Interleaved Unit End Address (ILVU_EA)
identifying the address of the last pack in the ILVU to
which the VOBU belongs, and the starting address of
the next interleaved unit ILVU, Next Interleaved Unit
Start Address (NT_ILVU_SA), are also declared when a
video object unit VOBU is in an interleaved block. Both
the Interleaved Unit End Address (ILVU_EA) and Next
Interleaved Unit Start Address (NT_ILVU_SA) are
expressed as the number of sectors from the navigation
pack NV of that VOBU.
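Since both addresses are relative to the navigation pack NV of the current VOBU, a reader resolves them by adding the sector number of that navigation pack. The following Python sketch illustrates this; the record layout, the function name, and the example sector numbers are assumptions and do not reflect the on-disk encoding of the DSI packet.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class SeamlessPlaybackInfo:
    ilvu_flag: bool      # True if the VOBU is in an interleaved block
    unit_end_flag: bool  # True if the VOBU is the last VOBU in its ILVU
    ilvu_ea: int         # sectors from this NV pack to the last pack of the ILVU
    nt_ilvu_sa: int      # sectors from this NV pack to the start of the next ILVU


def resolve_ilvu_addresses(nv_sector: int,
                           info: SeamlessPlaybackInfo) -> Optional[Tuple[int, int]]:
    """Return (ILVU end sector, next ILVU start sector), or None outside an interleaved block."""
    if not info.ilvu_flag:
        return None
    return nv_sector + info.ilvu_ea, nv_sector + info.nt_ilvu_sa


# Hypothetical example: the NV pack of the current VOBU sits at sector 1000.
info = SeamlessPlaybackInfo(ilvu_flag=True, unit_end_flag=False,
                            ilvu_ea=120, nt_ilvu_sa=400)
print(resolve_ilvu_addresses(1000, info))  # (1120, 1400)
```

In this example the reader would read continuously up to sector 1120 and then jump to sector 1400 to continue with the next interleaved unit of the same stream.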

[0186] When two system streams are seamlessly
connected but the audio components of the two system
streams are not contiguous, particularly immediately
before and after the seam, it is necessary to pause the
audio output to synchronize the audio and video
components of the system stream following the seam.
Note that non-contiguous audio may result from different
audio signals being recorded with the corresponding
video blocks. With an NTSC signal, for example, the
video frame cycle is approximately 33.33 msec while
the AC-3 audio frame cycle is 32 msec.
[0187] To enable this synchronization, audio
reproduction stopping times 1 and 2, i.e., Audio Stop
PTM 1 in VOB (VOB_A_STP_PTM1), and Audio Stop
PTM 2 in VOB (VOB_A_STP_PTM2), indicating the time
at which the audio is to be paused; and audio
reproduction stopping periods 1 and 2, i.e., Audio Gap
Length 1 in VOB (VOB_A_GAP_LEN1) and Audio Gap
Length 2 in VOB (VOB_A_GAP_LEN2), indicating for how
long the audio is to be paused, are also declared in the
DSI packet. Note that these times are specified at the
system clock precision (90 kHz).
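The reason video and audio frame boundaries rarely coincide at a seam can be seen by expressing both frame periods in 90 kHz ticks. The Python sketch below is a rough illustration only and does not show how the audio gap fields are actually computed: it assumes the exact NTSC frame period of 1001/30000 sec. (3003 ticks), uses the 32 msec AC-3 frame period from the text (2880 ticks), and the function names are hypothetical.

```python
SYSTEM_CLOCK_HZ = 90_000
VIDEO_FRAME_TICKS = 3003  # assumption: exact NTSC frame period, 1001/30000 s
AC3_FRAME_TICKS = 2880    # 32 ms AC-3 frame period, from the text


def audio_misalignment_ticks(video_frames: int) -> int:
    """Ticks by which the video end overshoots the last whole audio frame boundary."""
    return (video_frames * VIDEO_FRAME_TICKS) % AC3_FRAME_TICKS


def ticks_to_ms(ticks: int) -> float:
    """Convert 90 kHz ticks to milliseconds."""
    return ticks * 1000 / SYSTEM_CLOCK_HZ


ticks = audio_misalignment_ticks(15)
print(ticks, ticks_to_ms(ticks))  # 1845 20.5
```

For a 15-frame stretch of video, for example, the audio frame boundaries are offset by 1845 ticks (20.5 msec) from the video end, which is the kind of mismatch the audio stop time and gap length fields are declared to absorb.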

[0188] The Angle Information for Seamless playback 
(SML_AGLI) declares the read start address when the 
angle is changed. Note that this field is valid when 
seamless, multi-angle control is enabled. This address 
is also expressed as the number of sectors from the 
navigation pack NV of that VOBU. Because there are 
nine or fewer angles, there are nine angle address dec- 
laration cells: Destination Address of Angle Cell #1 for 
Seamless playback (SML_AGL_C1_DSTA) to Destina- 
tion Address of Angle Cell #9 for Seamless playback 
(SML_AGL_C9_DSTA). 

[0189] Note also that each title is edited in video object
(VOB) units. Interleaved video objects (interleaved title
editing units) are referenced as "VOBS"; and the
encoded range of the source data is the encoding unit.

DVD encoder 

[0190] A preferred embodiment of a digital video disk
system authoring encoder ECD in which the multimedia 
bitstream authoring system according to the present 
invention is applied to a digital video disk system is 
described below and shown in Fig. 25. It will be obvious 
that the authoring encoder ECD applied to the digital 
video disk system, referred to below as a DVD encoder, 
is substantially identical to the authoring encoder EC 
shown in Fig. 2. The basic difference between these 
encoders is the replacement in the DVD encoder ECD 
of the video zone formatter 1300 of the authoring 
encoder EC above with a VOB buffer 1000 and format- 
ter 1100. It will also be obvious that the bitstream 
encoded by this DVD encoder ECD is recorded to a
digital video disk medium M. The operation of this DVD
encoder ECD is therefore described below in compari- 
son with the authoring encoder EC described above. 
[0191] As in the above authoring encoder EC, the 
encoding system controller 200 generates control
signals St9, St11, St13, St21, St23, St25, St33, and St39
based on the scenario data St7 describing the user- 
defined editing instructions input from the scenario edi- 
tor 100, and controls the video encoder 300, sub-picture 
encoder 500, and audio encoder 700 in the DVD 
encoder ECD. Note that the user-defined editing 
instructions in the DVD encoder ECD are a superset of 
the editing instructions of the authoring encoder EC 
described above. 

[0192] Specifically, the user-defined editing instruc- 
tions (scenario data St7) in the DVD encoder ECD sim- 
ilarly describe what source data is selected from all or a 
subset of the source data containing plural titles within a 
defined time period, and how the selected source data 
is reassembled to reproduce the scenario (sequence) 
intended by the user. The scenario data St7 of the DVD 
encoder ECD, however, further contains such informa- 
tion as: the number of streams contained in the editing 
units, which are obtained by splitting a multi-title source 
stream into blocks at a constant time interval; the 
number of audio and sub-picture data cells contained in 
each stream, and the sub-picture display time and 
period; whether the title is a multi-rated title enabling 
parental lock control; whether the user content is 
selected from plural streams including, for example, 
multiple viewing angles; and the method of connecting 
scenes when the angle is switched among the multiple 
viewing angles. 

[0193] The scenario data St7 of the DVD encoder 
ECD also contains control information on a video object 
VOB unit basis. This information is required to encode 
the media source stream, and specifically includes such 
information as whether there are multiple angles or 
parental control features. When multiple angle viewing 
is enabled, the scenario data St7 also contains the 
encoding bit rate of each stream considering data inter- 
leaving and the disk capacity, the start and end times of 
each control, and whether a seamless connection 
should be made between the preceding and following 
streams. 

[0194] The encoding system controller 200 extracts 
this information from the scenario data St7, and gener- 
ates the encoding information table and encoding 
parameters required for encoding control. The encoding 
information table and encoding parameters are 
described with reference to Figs. 27, 28, and 29 below. 
[0195] The stream encoding data St33 contains the 
system stream encoding parameters and system 
encoding start and end timing values required by the 
DVD system to generate the VOBs. These system 
stream encoding parameters include the conditions for 
connecting one video object VOB with those before and 
after, the number of audio streams, the audio encoding
information and audio IDs, the number of sub-pictures
and the sub-picture IDs, the video playback starting time
information VPTS, and the audio playback starting time 
information APTS. 
[0196] The title sequence control signal St39 supplies
the multimedia bitstream MBS formatting start and end 
timing information and formatting parameters declaring 
the reproduction control information and interleave 
information. 

[0197] Based on the video encoding parameter and
encoding start/end timing signal St9, the video encoder
300 encodes a specific part of the video stream St1 to
generate an elementary stream conforming to the
MPEG2 Video standard defined in ISO-13818. This
elementary stream is output to the video stream buffer 400
as encoded video stream St15.

[0198] Note that while the video encoder 300 gener- 
ates an elementary stream conforming to the MPEG2 
Video standard defined in ISO-13818, specific encoding
parameters are input via the video encoding parameter
signal St9, including the encoding start and end timing,
bit rate, the encoding conditions for the encoding start
and end, the material type, including whether the material
is an NTSC or PAL video signal or telecine converted
material, and whether the encoding mode is set
for either open GOP or closed GOP encoding.
[0199] The MPEG2 coding method is basically an 
interframe coding method using the correlation between 
frames for maximum signal compression, i.e., the frame
being coded (the target frame) is coded by referencing
frames before and/or after the target frame. However,
intra-coded frames, i.e., frames that are coded based
solely on the content of the target frame, are also
inserted to avoid error propagation and enable accessibility
from mid-stream (random access). The coding unit
containing at least one intra-coded frame ("intra-frame")
is called a group_of_pictures GOP.
[0200] A group_of_pictures GOP in which coding is 
closed completely within that GOP is known as a
"closed GOP." A group_of_pictures GOP containing a
frame coded with reference to a frame in a preceding or
following (ISO-13818 DOES NOT LIMIT P- and B-picture
CODING to referencing PAST frames)
group_of_pictures GOP is an "open GOP." It is therefore
possible to play back a closed GOP using only that GOP.
Reproducing an open GOP, however, also requires the 
presence of the referenced GOP, generally the GOP 
preceding the open GOP. 
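
The distinction between closed and open GOPs can be sketched as a simple reference check. In the Python fragment below, the Frame record, the stream-wide frame indices, and the explicit list of referenced frames are hypothetical conveniences for illustration; an actual MPEG2 stream does not carry references in this form.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Frame:
    index: int                 # position in the whole stream (illustrative)
    kind: str                  # "I", "P", or "B"
    refs: List[int] = field(default_factory=list)  # indices of referenced frames


def is_closed_gop(gop: List[Frame]) -> bool:
    """A GOP is closed when every reference stays inside the GOP itself."""
    members = {f.index for f in gop}
    return all(ref in members for f in gop for ref in f.refs)


closed = [Frame(0, "I"), Frame(1, "B", [0, 2]), Frame(2, "P", [0])]
# Frame 3 is a B-picture that references frame 2 in the preceding GOP.
open_ = [Frame(3, "B", [2, 4]), Frame(4, "I"), Frame(5, "P", [4])]

print(is_closed_gop(closed))  # True
print(is_closed_gop(open_))   # False
```

Under this model the second GOP cannot be reproduced by itself, because decoding frame 3 requires frame 2 from the GOP before it.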

[0201] The GOP is often used as the access unit. For 
example, the GOP may be used as the playback start
point for reproducing a title from the middle, as a transition
point in a movie, or for fast-forward play and other
special reproduction modes. High speed reproduction
can be achieved in such cases by reproducing only the
intra-frame coded frames in a GOP or by reproducing
only frames in GOP units. 

[0202] Based on the sub-picture stream encoding 
parameter signal St11, the sub-picture encoder 500 
encodes a specific part of the sub-picture stream St3 to
generate a variable length coded bitstream of bit-mapped
data. This variable length coded bitstream data
is output as the encoded sub-picture stream St17 to the
sub-picture stream buffer 600.
[0203] Based on the audio encoding parameter signal 
St13, the audio encoder 700 encodes a specific part of 
the audio stream St5 to generate the encoded audio 
data. This encoded audio data may be data based on 
the MPEG1 audio standard defined in ISO-11172 and
the MPEG2 audio standard defined in ISO-13818, AC-3 
audio data, or PCM (LPCM) data. Note that the meth- 
ods and means of encoding audio data according to 
these standards are known and commonly available. 
[0204] The video stream buffer 400 is connected to
the video encoder 300 and to the encoding system controller
200. The video stream buffer 400 stores the
encoded video stream St15 input from the video
encoder 300, and outputs the stored encoded video
stream St15 as the time-delayed encoded video stream
St27 based on the timing signal St21 supplied from the 
encoding system controller 200. 

[0205] The sub-picture stream buffer 600 is similarly 
connected to the sub-picture encoder 500 and to the 
encoding system controller 200. The sub-picture stream
buffer 600 stores the encoded sub-picture stream St17
input from the sub-picture encoder 500, and then outputs
the stored encoded sub-picture stream St17 as
time-delayed encoded sub-picture stream St29 based
on the timing signal St23 supplied from the encoding
system controller 200. 

[0206] The audio stream buffer 800 is similarly con- 
nected to the audio encoder 700 and to the encoding 
system controller 200. The audio stream buffer 800 
stores the encoded audio stream St19 input from the
audio encoder 700, and then outputs the encoded audio 
stream St19 as the time-delayed encoded audio stream 
St31 based on the timing signal St25 supplied from the 
encoding system controller 200. 

[0207] The system encoder 900 is connected to the
video stream buffer 400, sub-picture stream buffer 600,
audio stream buffer 800, and the encoding system controller
200, and is respectively supplied thereby with the
time-delayed encoded video stream St27, time-delayed
encoded sub-picture stream St29, time-delayed
encoded audio stream St31, and the system stream
encoding parameter data St33. Note that the system
encoder 900 is a multiplexer that multiplexes the
time-delayed streams St27, St29, and St31 based on the
stream encoding data St33 (timing signal) to generate
title editing units (VOBs) St35. 
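
The multiplexing role of the system encoder 900 can be illustrated, under strong simplifying assumptions, as a time-ordered merge of three pack sequences. The Pack tuple, the time stamp values, and the rule of ordering purely by time stamp are hypothetical; the actual multiplexing is governed by the stream encoding data St33 and the applicable system stream rules, which this sketch does not model.

```python
import heapq
from typing import Iterable, List, Tuple

Pack = Tuple[int, str, bytes]  # (time stamp in clock ticks, stream label, payload)


def multiplex(video: Iterable[Pack], sub_picture: Iterable[Pack],
              audio: Iterable[Pack]) -> List[Pack]:
    """Merge three time-ordered pack sequences into one time-ordered sequence."""
    # heapq.merge assumes each input is already sorted by the key.
    return list(heapq.merge(video, sub_picture, audio, key=lambda p: p[0]))


st27 = [(0, "video", b"v0"), (3000, "video", b"v1")]
st29 = [(1500, "sub-picture", b"s0")]
st31 = [(0, "audio", b"a0"), (2160, "audio", b"a1")]

vob = multiplex(st27, st29, st31)
print([(t, label) for t, label, _ in vob])
```

Packs with equal time stamps keep the order of the input arguments, so the video pack at time 0 precedes the audio pack at time 0 in this sketch.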

[0208] The VOB buffer 1000 temporarily stores the 
video objects VOBs produced by the system encoder 
900. The formatter 1100 reads the delayed video 
objects VOB from the VOB buffer 1000 based on the
title sequence control signal St39 to generate one video 
zone VZ, and adds the volume file structure VFS to gen- 
erate the edited multimedia stream data St43. 



[0209] The multimedia bitstream MBS St43 edited 
according to the user-defined scenario is then sent to 
the recorder 1200. The recorder 1200 processes the 
edited multimedia stream data St43 to the data stream 
St45 format of the recording medium M, and thus 
records the formatted data stream St45 to the recording 
medium M. 

DVD decoder 

[0210] A preferred embodiment of a digital video disk
system authoring decoder DCD in which the multimedia 
bitstream authoring system of the present invention is 
applied to a digital video disk system is described below 
and shown in Fig. 26. The authoring decoder DCD 
applied to the digital video disk system, referred to 
below as a DVD decoder DCD, decodes the multimedia 
bitstream MBS edited using the DVD encoder ECD of 
the present invention, and recreates the content of each 
title according to the user-defined scenario. It will also 
be obvious that the multimedia bitstream St45 encoded 
by this DVD encoder ECD is recorded to a digital video 
disk medium M. 

[0211] The basic configuration of the DVD decoder 
DCD according to this embodiment is the same as that 
of the authoring decoder DC shown in Fig. 3. The differ- 
ences are that a different video decoder 3801 (shown 
as 3800 in Fig. 26) is used in place of the video decoder 
3800, and a reordering buffer 3300 and selector 3400 
are disposed between the video decoder 3801 and syn- 
thesizer 3500. 

[0212] Note that the selector 3400 is connected to the
synchronizer 2900, and is controlled by a switching sig- 
nal St103. 

[0213] The operation of this DVD decoder DCD is 
therefore described below in comparison with the 
authoring decoder DC described above. 
[0214] As shown in Fig. 26, the DVD decoder DCD 
comprises a multimedia bitstream producer 2000, scenario
selector 2100, decoding system controller 2300,
stream buffer 2400, system decoder 2500, video buffer
2600, sub-picture buffer 2700, audio buffer 2800, synchronizer
2900, video decoder 3801, reordering buffer
3300, sub-picture decoder 3100, audio decoder 3200, 
selector 3400, synthesizer 3500, video data output ter- 
minal 3600, and audio data output terminal 3700. 
[0215] The bitstream producer 2000 comprises a 
recording media drive unit 2004 for driving the recording 
medium M; a reading head 2006 for reading the infor- 
mation recorded to the recording medium M and pro- 
ducing the binary read signal St57; a signal processor 
2008 for variously processing the read signal St57 to 
generate the reproduced bitstream St61; and a reproduction
controller 2002.

[0216] The reproduction controller 2002 is connected 
to the decoding system controller 2300 from which the 
multimedia bitstream reproduction control signal St53 is 
supplied, and in turn generates the reproduction control 
signals St55 and St59 respectively controlling the
recording media drive unit (motor) 2004 and signal 
processor 2008. 

[0217] So that the user-defined video, sub-picture, 
and audio portions of the multimedia title edited by the
authoring encoder EC are reproduced, the authoring
decoder DC comprises a scenario selector 2100 for
selecting and reproducing the corresponding scenes
(titles). The scenario selector 2100 then outputs the
selected titles as scenario data to the DVD decoder
DCD. 

[0218] The scenario selector 2100 preferably com- 
prises a keyboard, CPU, and monitor. Using the key- 
board, the user then inputs the desired scenario based 
on the content of the scenario input by the DVD encoder
ECD. Based on the keyboard input, the CPU generates 
the scenario selection data St51 specifying the selected 
scenario. The scenario selector 2100 is connected to 
the decoding system controller 2300 by an infrared 
communications device, for example, and inputs the
generated scenario selection data St51 to the decoding 
system controller 2300. 

[0219] The stream buffer 2400 has a specific buffer 
capacity used to temporarily store the reproduced bitstream
St61 input from the bitstream producer 2000, and
extracts the volume file structure VFS, the initial synchronization
data SCR (system clock reference) in each
pack, and the VOBU control information (DSI) in the
navigation pack NV, to generate the bitstream control
data St63. The stream buffer 2400 is also connected to
the decoding system controller 2300, to which it supplies
the generated bitstream control data St63.
[0220] Based on the scenario selection data St51 supplied
by the scenario selector 2100, the decoding system
controller 2300 then generates the bitstream
reproduction control signal St53 controlling the operation
of the bitstream producer 2000. The decoding system
controller 2300 also extracts the user-defined
playback instruction data from the bitstream reproduction
control signal St53, and generates the decoding
information table required for decoding control. This
decoding information table is described further below
with reference to Figs. 58 and 59. The decoding system
controller 2300 also extracts the title information
recorded to the optical disk M from the file data structure
area FDS of the bitstream control data St63 to generate
the title information signal St200. Note that the
extracted title information includes the video manager
VMG, VTS information VTSI, the PGC information
entries C_PBI #j, and the cell presentation time
C_PBTM.

[0221] Note that the bitstream control data St63 is 
generated in pack units as shown in Fig. 19, and is supplied
from the stream buffer 2400 to the decoding system
controller 2300, to which the stream buffer 2400 is
connected. 

[0222] The synchronizer 2900 is connected to the 
decoding system controller 2300 from which it receives
the system clock reference SCR contained in the synchronization
control data St81 to set the internal system
clock STC and supply the reset system clock St79 to the 
decoding system controller 2300. 
[0223] Based on this system clock St79, the decoding 
system controller 2300 also generates the stream read 
signal St65 at a specific interval and outputs the read 
signal St65 to the stream buffer 2400. Note that the read 
unit in this case is the pack. 

[0224] The method of generating the stream read sig- 
nal St65 is described next. 

[0225] The decoding system controller 2300 com- 
pares the system clock reference SCR contained in the 
stream control data extracted from the stream buffer 
2400 with the system clock St79 supplied from the syn- 
chronizer 2900, and generates the read request signal 
St65 when the system clock St79 is greater than the 
system clock reference SCR of the bitstream control 
data St63. Pack transfers are controlled by executing 
this control process on a pack unit. 
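
The comparison described above can be sketched as follows. This is a simplified model under stated assumptions: the function name, the representation of the system clock St79 as a sequence of sampled tick values, and the choice to request every pack whose SCR has already been passed at a given sample are all illustrative, not taken from the embodiment.

```python
from typing import Iterable, Iterator, Tuple


def packs_to_transfer(pack_scrs: Iterable[int],
                      stc_ticks: Iterable[int]) -> Iterator[Tuple[int, int]]:
    """Yield (stc, scr) each time the read request St65 would be raised."""
    pending = iter(pack_scrs)
    scr = next(pending, None)
    for stc in stc_ticks:
        # A pack is requested only once the system clock exceeds its SCR.
        while scr is not None and stc > scr:
            yield stc, scr
            scr = next(pending, None)


# Packs stamped at 0, 100, and 200 ticks; clock sampled every 50 ticks.
requests = list(packs_to_transfer([0, 100, 200], range(0, 301, 50)))
print(requests)  # [(50, 0), (150, 100), (250, 200)]
```

Because the comparison is strict, a pack whose SCR equals the current clock value is held until the next sample in this sketch.
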
[0226] Based on the scenario selection data St51, the
decoding system controller 2300 generates the decoding
signal St69 defining the stream IDs for the video,
sub-picture, and audio bitstreams corresponding to the
selected scenario, and outputs it to the system decoder
2500.

[0227] When a title contains plural audio tracks, e.g. 
audio tracks in Japanese, English, French, and/or other 
languages, and plural sub-picture tracks for subtitles in 
Japanese, English, French, and/or other languages, for 
example, a discrete ID is assigned to each of the language
tracks. As described above with reference to Fig.
19, a stream ID is assigned to the video data and MPEG
audio data, and a substream ID is assigned to the sub- 
picture data, AC-3 audio data, linear PCM data, and 
navigation pack NV information. While the user need 
never be aware of these ID numbers, the user can 
select the language of the audio and/or subtitles using 
the scenario selector 2100. If English language audio is 
selected, for example, the ID corresponding to the Eng- 
lish audio track is sent to the decoding system controller 
2300 as scenario selection data St51. The decoding 
system controller 2300 then adds this ID to the decod- 
ing signal St69 output to the system decoder 2500. 
[0228] Based on the instructions contained in the 
decoding signal St69, the system decoder 2500 respectively
outputs the video, sub-picture, and audio bitstreams
input from the stream buffer 2400 to the video
buffer 2600, sub-picture buffer 2700, and audio buffer
2800 as the encoded video stream St71, encoded
sub-picture stream St73, and encoded audio stream St75.
Thus, when the stream ID input from the scenario selector
2100 and the pack ID input from the stream buffer
2400 match, the system decoder 2500 outputs the corresponding
packs to the respective buffers (i.e., the
video buffer 2600, sub-picture buffer 2700, and audio
buffer 2800).
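
The ID matching performed by the system decoder 2500 can be sketched as a simple routing step. In the Python fragment below, the Pack tuple, the buffer names, and the small integer IDs are placeholders chosen for illustration; they are not the stream ID or substream ID values actually recorded in the packs.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set, Tuple

Pack = Tuple[str, int, bytes]  # (kind, stream or substream ID, payload)


def route_packs(packs: Iterable[Pack],
                selected_ids: Dict[str, Set[int]]) -> Dict[str, List[bytes]]:
    """Send each pack to the buffer for its kind if its ID was selected."""
    buffers: Dict[str, List[bytes]] = defaultdict(list)
    for kind, stream_id, payload in packs:
        if stream_id in selected_ids.get(kind, set()):
            buffers[kind].append(payload)
    return dict(buffers)


packs = [
    ("video", 0, b"picture"),
    ("audio", 0, b"japanese"), ("audio", 1, b"english"),
    ("sub_picture", 0, b"ja-subtitle"), ("sub_picture", 1, b"en-subtitle"),
]
# Placeholder for the IDs carried in decoding signal St69 (English selected).
st69 = {"video": {0}, "audio": {1}, "sub_picture": {1}}
print(route_packs(packs, st69))
```

Packs whose IDs do not match the selection are simply not forwarded, which is how the non-selected language tracks are discarded in this model.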

[0229] The system decoder 2500 detects the presentation
time stamp PTS and decoding time stamp DTS of
the smallest control unit in each bitstream St67 to generate
the time information signal St77. This time information
signal St77 is supplied to the synchronizer 2900
through the decoding system controller 2300 as the
synchronization control data St81.
[0230] Based on this synchronization control data 
St81, the synchronizer 2900 determines the decoding 
start timing whereby each of the bitstreams will be 
arranged in the correct sequence after decoding, and
then generates and inputs the video stream decoding
start signal St89 to the video decoder 3801 based on
this decoding timing. The synchronizer 2900 also generates
and supplies the sub-picture decoding start signal
St91 and audio stream decoding start signal St93 to
the sub-picture decoder 3100 and audio decoder 3200, 
respectively. 

[0231] The video decoder 3801 generates the video 
output request signal St84 based on the video stream 
decoding start signal St89, and outputs it to the video
buffer 2600. In response to the video output request signal
St84, the video buffer 2600 outputs the video stream
St83 to the video decoder 3801. The video decoder
3801 thus detects the presentation time information
contained in the video stream St83, and disables the
video output request signal St84 when the length of the
received video stream St83 is equivalent to the specified
presentation time. A video stream equal in length to
the specified presentation time is thus decoded by the
video decoder 3801, which outputs the reproduced
video signal St95 to the reordering buffer 3300 and 
selector 3400. 

[0232] Because the encoded video stream is coded 
using the interframe correlations between pictures, the 
coded order and display order do not necessarily match
on a frame unit basis. The video cannot, therefore, be
displayed in the decoded order, and the decoded frames
are temporarily stored to the reordering buffer
3300. The synchronizer 2900 controls the
switching signal St103 so that the reproduced video signal
St95 output from the video decoder 3801 and the
reordering buffer output St97 are appropriately selected
and output in the display order to the synthesizer 3500.
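
The selection between the decoder output and the reordering buffer output can be illustrated with a common reordering rule. The sketch below assumes, beyond what the text states, that B-pictures are passed straight through while each I- or P-picture is held until the next I- or P-picture is decoded; the Picture tuple and the display indices are hypothetical.

```python
from typing import Iterable, List, Optional, Tuple

Picture = Tuple[str, int]  # (picture type, display index), illustrative only


def display_order(decoded: Iterable[Picture]) -> List[Picture]:
    """Rearrange pictures from decoded (coded) order into display order."""
    held: Optional[Picture] = None   # plays the role of reordering buffer 3300
    shown: List[Picture] = []
    for pic in decoded:
        if pic[0] == "B":
            shown.append(pic)        # selector passes the decoder output through
        else:
            if held is not None:
                shown.append(held)   # selector switches to the buffer output
            held = pic
    if held is not None:
        shown.append(held)           # flush the last held picture
    return shown


coded = [("I", 0), ("P", 3), ("B", 1), ("B", 2), ("P", 6), ("B", 4), ("B", 5)]
print([index for _, index in display_order(coded)])  # [0, 1, 2, 3, 4, 5, 6]
```
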
[0233] The sub-picture decoder 3100 similarly generates
the sub-picture output request signal St86 based
on the sub-picture decoding start signal St91, and outputs
it to the sub-picture buffer 2700. In response to the
sub-picture output request signal St86, the sub-picture
buffer 2700 outputs the sub-picture stream St85 to the
sub-picture decoder 3100. Based on the presentation
time information contained in the sub-picture stream
St85, the sub-picture decoder 3100 decodes a length of
the sub-picture stream St85 corresponding to the specified
presentation time to reproduce and supply to the
synthesizer 3500 the sub-picture signal St99.
[0234] The synthesizer 3500 superimposes the selector
3400 output with the sub-picture signal St99 to generate
and output the video signal St105 to the video
data output terminal 3600.

[0235] The audio decoder 3200 generates and sup- 
plies to the audio buffer 2800 the audio output request 
signal St88 based on the audio stream decoding start 
signal St93. The audio buffer 2800 thus outputs the 
audio stream St87 to the audio decoder 3200. The 
audio decoder 3200 decodes a length of the audio 
stream St87 corresponding to the specified presenta- 
tion time based on the presentation time information 
contained in the audio stream St87, and outputs the 
decoded audio stream St101 to the audio data output 
terminal 3700. 

[0236] It is thus possible to reproduce a user-defined 
multimedia bitstream MBS in real-time according to a 
user-defined scenario. More specifically, each time the 
user selects a different scenario, the DVD decoder DCD 
is able to reproduce the title content desired by the user 
in the desired sequence by reproducing the multimedia 
bitstream MBS corresponding to the selected scenario. 
[0237] It should be noted that the decoding system 
controller 2300 may supply the title information signal 
St200 to the scenario selector 2100 by means of the 
infrared communications device mentioned above or 
another means. Interactive scenario selection control- 
led by the user can also be made possible by the sce- 
nario selector 2100 extracting the title information 
recorded to the optical disk M from the file data struc- 
ture area FDS of the bitstream control data St63 con- 
tained in the title information signal St200, and 
displaying this title information on a display for user 
selection. 

[0238] Note, further, that the stream buffer 2400, video 
buffer 2600, sub-picture buffer 2700, audio buffer 2800, 
and reordering buffer 3300 are expressed above and in 
the figures as separate entities because they are func- 
tionally different. It will be obvious, however, that a sin- 
gle buffer memory can be controlled to provide the 
same discrete functionality by time-share controlled use 
of a buffer memory with an operating speed plural times 
faster than the read and write rates of these separate 
buffers. 

Multi-scene control 

[0239] The concept of multiple angle scene control 
according to the present invention is described below 
with reference to Fig. 21. As described above, titles that
can be played back with numerous variations are con- 
structed from basic scene periods containing data com- 
mon to each title, and multi-scene periods comprising 
groups of different scenes corresponding to the various 
scenario requests. In Fig. 21, scenes 1, 5, and 8 are the
common scenes of the basic scene periods. The multi-angle
scenes (angles 1, 2, and 3) between scenes 1
and 5, and the parental locked scenes (scenes 6 and 7) 
between scenes 5 and 8, are the multi-scene periods. 
[0240] Scenes taken from different angles, i.e., angles 
1, 2, and 3 in this example, can be dynamically selected
and reproduced during playback in the multi-angle
scene period. In the parental locked scene period, how- 
ever, only one of the available scenes, scenes 6 and 7, 
having different content can be selected, and must be 
selected statically before playback begins. 
[0241] Which of these scenes from the multi-scene 
periods is to be selected and reproduced is defined by 
the user operating the scenario selector 2100 and 
thereby generating the scenario selection data St51. In 
scenario 1 in Fig. 21 the user can freely select any of the 
multi-angle scenes, and scene 6 has been preselected 
for output in the parental locked scene period. Similarly 
in scenario 2, the user can freely select any of the multi- 
angle scenes, and scene 7 has been preselected for 
output in the parental locked scene period. 
[0242] The contents of the program chain information
VTS_PGCI are described next with reference to Figs. 30
and 31. Fig. 30 shows the scenarios requested by the
user in terms of the VTSI data structure. Scenario 1 and
scenario 2 shown in Fig. 21 are described as program
chain information VTS_PGCI#1 and VTS_PGCI#2.
VTS_PGCI#1 describing scenario 1 consists of cell playback
information C_PBI#1 corresponding to scene 1, C_PBI#2,
C_PBI#3, and C_PBI#4 within a multi-angle cell block,
C_PBI#5 corresponding to scene 5, C_PBI#6 corresponding
to scene 6, and C_PBI#7 corresponding to
scene 8.

[0243] VTS_PGCI#2 describing scenario 2 consists
of cell playback information C_PBI#1 corresponding
to scene 1, C_PBI#2, C_PBI#3, and C_PBI#4 within
a multi-angle cell block corresponding to a multi-angle
scene, C_PBI#5 corresponding to scene 5, C_PBI#6
corresponding to scene 7, and C_PBI#7 corresponding
to scene 8. In the digital video disk system data
structure, a scene, which is the control unit of a scenario,
is thus described by a cell, the unit one level below, and
a scenario requested by the user can thereby be realized.
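
The two program chains can be sketched as lists of cells. In the Python fragment below, the function names and the tuple layout are hypothetical, and the assignment of angles 1 to 3 to C_PBI#2 to C_PBI#4 in that order is an assumption made for illustration; the text states only that these three cells form the multi-angle cell block.

```python
from typing import List, Tuple

Cell = Tuple[str, str, bool]  # (cell label, content, inside multi-angle block)


def build_pgc(parental_scene: int) -> List[Cell]:
    """Cells of one program chain; only C_PBI#6 differs between scenarios."""
    return [
        ("C_PBI#1", "scene 1", False),
        ("C_PBI#2", "angle 1", True),
        ("C_PBI#3", "angle 2", True),
        ("C_PBI#4", "angle 3", True),
        ("C_PBI#5", "scene 5", False),
        ("C_PBI#6", f"scene {parental_scene}", False),
        ("C_PBI#7", "scene 8", False),
    ]


def playback_path(pgc: List[Cell], angle: int) -> List[str]:
    """Resolve a program chain to the content played for one chosen angle."""
    path = []
    for _, content, in_angle_block in pgc:
        if in_angle_block and content != f"angle {angle}":
            continue  # only the selected angle of the block is played
        path.append(content)
    return path


scenario_1 = build_pgc(parental_scene=6)  # stands for VTS_PGCI#1
scenario_2 = build_pgc(parental_scene=7)  # stands for VTS_PGCI#2
print(playback_path(scenario_1, angle=2))
print(playback_path(scenario_2, angle=3))
```
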
[0244] Fig. 31 shows the scenarios requested by the
user in Fig. 21 in terms of the VOB data structure
VTSTT_VOBS. As specifically
shown in Fig. 31, the two scenarios 1 and 2 use the
same VOB data in common. The single scenes shared
by both scenarios, i.e., VOB#1 corresponding
to scene 1, VOB#5 corresponding to scene
5, and VOB#8 corresponding to scene 8, are arranged in
non-interleaved blocks, i.e., contiguous blocks.
[0245] In the multi-angle data shared by scenarios
1 and 2, the scene data for each angle is
constructed as a single VOB. Specifically,
angle 1 is constructed by VOB#2, angle 2 by
VOB#3, and angle 3 by VOB#4.
The multi-angle data thus constructed is formed as an
interleaved block to enable switching between
angles and seamless reproduction of each angle's data.
Scenes 6 and 7, which are specific to scenarios 1 and 2,
respectively, are also formed as an interleaved block to
enable seamless reproduction with the common scenes
before and after them as well as seamless
reproduction between each scene.

[0246] As described above, the scenarios requested by
the user as shown in Fig. 21 can be realized by using
the video title playback control information shown in Fig.
30 and the title playback VOB data structure shown in
Fig. 31.

Seamless playback 

[0247] The seamless playback capability briefly men- 
tioned above with regard to the digital video disk system 
data structure is described below. Note that seamless 
playback refers to the reproduction in a digital video disk
system of multimedia data including video, audio, and
sub-picture data without intermittent breaks in the data 
or information between basic scene periods, between 
basic scene periods and multi-scene periods, and 
between multi-scene periods. 

[0248] Hardware factors contributing to intermittent
playback of this data and title content include decoder 
underflow, i.e., an imbalance between the source data 
input speed and the decoding speed of the input source 
data. 

[0249] Other factors relate to the properties of the
playback data. When the playback data is data that
must be continuously reproduced for a constant time
unit in order for the user to understand the content or
information, e.g., audio data, data continuity is lost
when the required continuous presentation time cannot
be assured. Reproduction of such information whereby
the required continuity is assured is referred to as
"continuous information reproduction," or "seamless
information reproduction." Reproduction of this information
when the required continuity cannot be assured is
referred to as "non-continuous information reproduction,"
or "non-seamless information reproduction." It is
obvious that continuous information reproduction and
non-continuous information reproduction are, respectively,
seamless and non-seamless reproduction.

[0250] Note that seamless reproduction can be further 
categorized as seamless data reproduction and seamless
information reproduction. Seamless data reproduction
is defined as preventing physical blanks or
interruptions in the data playback (intermittent reproduction)
as a result of a buffer underflow state, for example.
Seamless information reproduction is defined as
preventing apparent interruptions in the information
when perceived by the user (intermittent presentation)
when recognizing information from the playback data
where there are no actual physical breaks in the data
reproduction.

Details of Seamless playback 

[0251] The specific method enabling seamless reproduction
as thus described is explained later with
reference to Figs. 23 and 24.

Interleaving

[0252] The DVD data system streams described 
above are recorded using an appropriate authoring 
encoder EC as a movie or other multimedia title on a
DVD recording medium. Note that the following descrip- 
tion refers to a movie as the multimedia title being proc- 
essed, but it will be obvious that the invention shall not 
be so limited. 

[0253] Supplying a single movie in a format enabling
the movie to be used in plural different cultural regions
or countries requires the script to be recorded in the various
languages used in those regions or countries. It
may even necessitate editing the content to conform to
the mores and moral expectations of different cultures.
Even using such a large-capacity storage system as the
DVD system, however, it is necessary to reduce the bit
rate, and therefore the image quality, if plural full-length
titles edited from a single common source title are
recorded to a single disk. This problem can be solved by
recording the common parts of plural titles only once,
and recording the segments different in each title for
each different title only. This method makes it possible
to record plural titles for different countries or cultures to
a single optical disk without reducing the bit rate, thereby
retaining high image quality.
[0254] As shown in Fig. 21, the titles recorded to a single
optical disk contain basic scene periods of scenes
common to all scenarios, and multi-scene periods containing
scenes specific to certain scenarios, to provide
parental lock control and multi-angle scene control functions.

[0255] In the case of the parental lock control function, 
titles containing sex scenes, violent scenes, or other 
scenes deemed unsuitable for children, i.e., so-called
"adult scenes," are recorded with a combination of common
scenes, adult scenes, and children's scenes.
These title streams are achieved by arraying the adult
and children's scenes to multi-scene periods between
the common basic scene periods.
[0256] Multi-angle control can be achieved in a conventional
single-angle title by recording plural multimedia
scenes obtained by recording the subjects from the
desired plural camera angles to the multi-scene periods
arrayed between the common basic scene periods.
Note, however, that while these plural scenes are
described here as scenes recorded from different camera
angles (positions), it will be obvious that the scenes
may be recorded from the same camera angle but at different
times, or may be data generated by computer graphics or
other video data.

[0257] When data is shared between different scenarios
of a single title, it is obviously necessary to move the
laser beam LS from the common scene data to the
non-common scene data during reproduction, i.e., to move
the optical pickup to a different position on the DVD
recording medium RC1. The problem here is that the
time required to move the optical pickup makes it
difficult to continue reproduction without creating breaks in
the audio or video, i.e., to sustain seamless reproduction.
This problem can be theoretically solved by providing
a track buffer (stream buffer 2400) to delay data
output by an amount equivalent to the worst access time.
In general, data recorded to an optical disk is read by
the optical pickup, appropriately processed, and
temporarily stored to the track buffer. The stored data is
subsequently decoded and reproduced as video or audio
data.
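
As a rough illustration of the track buffer argument above, the
following sketch computes how much data must already be held in the
buffer to keep the decoder supplied during one worst-case pickup
access. The function name and the numeric values are hypothetical
and are not taken from this specification.

```python
def min_track_buffer_bits(consume_rate_bps: float,
                          worst_access_time_s: float) -> float:
    """Bits the buffer must hold so decoding continues during a jump.

    Assumes no data arrives from the disk while the pickup is moving
    and that the decoder drains the buffer at a constant rate.
    """
    if consume_rate_bps < 0 or worst_access_time_s < 0:
        raise ValueError("rate and access time must be non-negative")
    return consume_rate_bps * worst_access_time_s


# Hypothetical figures: an 8 Mbit/s stream and a 0.5 s worst-case access.
required_bits = min_track_buffer_bits(8_000_000, 0.5)
print(required_bits / 8 / 1024, "KiB")
```

The constant-rate assumption is the simplest possible model; a real
stream has a variable rate, so the maximum rate would be the safe
value to use.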

Definition of Interleaving 

[0258] To thus enable the user to selectively excise
scenes and choose from among plural scenes, a state
wherein non-selected scene data is recorded inserted
between common scene data and selective scene data
necessarily occurs because the data units associated
with individual scenes are contiguously recorded to the
recording tracks of the recording medium. If data is then
read in the recorded sequence, non-selected scene
data must be accessed before accessing and decoding
the selected scene data, and a seamless connection
with the selected scene is difficult. The excellent
random access characteristics of the digital video disk
system, however, make seamless connections with the
selected scenes possible.

[0259] In other words, by splitting scene-specific data 
into plural units of a specified data size, and interleaving 
plural split data units for different scenes in a predefined 
sequence that is recorded to disk within the jumping 
range whereby a data underflow state does not occur,
it is possible to reproduce the selected scenes without 
data interruptions by intermittently accessing and 
decoding the data specific to the selected scenes using 
these split data units. Seamless data reproduction is 
thereby assured. 

Interleaved block and Interleave unit 

[0260] The interleaving method enabling seamless 
data reproduction according to the present invention is 
described below with reference to Fig. 24 and Fig. 67. 
Shown in Fig. 24 is a case from which three scenarios 
may be derived, i.e., branching from one video object
VOB-A to one of plural video objects VOB-B, VOB-C, 
and VOB-D, and then merging back again to a single 
video object VOB-E. The actual arrangement of these 
blocks recorded to a data recording track TR on disk is 
shown in Fig. 67. 

[0261] Referring to Fig. 67, VOB-A and VOB-E are
video objects with independent playback start and end
times, and are in principle arrayed to contiguous block
regions. As shown in Fig. 24, the playback start and end
times of VOB-B, VOB-C, and VOB-D are aligned during
interleaving. The interleaved data blocks are then
recorded to disk to a contiguous interleaved block
region. The contiguous block regions and interleaved
block regions are then written to disk in the track path Dr
direction in the playback sequence. Plural video objects
VOB, i.e., interleaved video objects VOBS, arrayed to
the data recording track TR are shown in Fig. 67.
[0262] Referring to Fig. 67, data regions to which data
is continuously arrayed are called "blocks," of which
there are two types: "contiguous block regions" in which
VOB with discrete starting and end points are
contiguously arrayed, and "interleaved block regions" in which
plural VOB with aligned starting and end points are
interleaved. The respective blocks are arrayed as
shown in Fig. 68 in the playback sequence, i.e., block 1,
block 2, block 3, ... block 7.

[0263] As shown in Fig. 68, the VTS title VOBS
(VTSTT_VOBS) consists of blocks 1 - 7, inclusive. Block
1 contains VOB 1 alone. Blocks 2, 3, 5, and 7 similarly
discretely contain VOB 2, 3, 6, and 10. Blocks 2, 3, 5,
and 7 are thus contiguous block regions.
[0264] Block 4, however, contains VOB 4 and VOB 5
interleaved together, while block 6 contains VOB 7, VOB
8, and VOB 9 interleaved together. Blocks 4 and 6 are
thus interleaved block regions.
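
The block layout described for Fig. 68 can be written down as a
small table. The sketch below is illustrative only: the mapping
copies the block and VOB numbers given in the text, while the rule
"one VOB means contiguous, several VOB mean interleaved" is a
simplifying assumption that merely fits this particular example.

```python
from typing import Dict, List

# Block number -> VOB numbers held by that block (values from the text).
BLOCKS: Dict[int, List[int]] = {
    1: [1], 2: [2], 3: [3], 4: [4, 5], 5: [6], 6: [7, 8, 9], 7: [10],
}


def block_type(vobs: List[int]) -> str:
    """Classify a block by how many VOB it holds (assumed rule)."""
    return "interleaved" if len(vobs) > 1 else "contiguous"


# Blocks are listed in playback sequence, block 1 through block 7.
for number, vobs in BLOCKS.items():
    print(f"block {number}: VOB {vobs} -> {block_type(vobs)}")
```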

[0265] The internal data structure of the contiguous
block regions is shown in Fig. 69 with VOB-i and VOB-j
arrayed as the contiguous blocks in the VOBS. As
described with reference to Fig. 16, VOB-i and VOB-j
inside the contiguous block regions are further logically
divided into cells as the playback unit. Both VOB-i and
VOB-j in this figure are shown comprising three cells
CELL #1, CELL #2, and CELL #3.
[0266] Each cell comprises one or more video object
units VOBU, with the video object unit VOBU defining the
boundaries of the cell. Each cell also contains
information identifying the position of the cell in the program
chain PGC (the playback control information of the
digital video disk system). More specifically, this position
information is the address of the first and last VOBU in
the cell. As also shown in Fig. 69, these VOB and the
cells defined therein are also recorded to a contiguous
block region so that contiguous blocks are contiguously
reproduced. Reproducing these contiguous blocks
therefore presents no problem.

[0267] The internal data structure of the interleaved
block regions is shown in Fig. 70. In the interleaved
block regions each video object VOB is divided into
interleaved units ILVU, and the interleaved units ILVU
associated with each VOB are alternately arrayed. Cell
boundaries are defined independently of the interleaved
units ILVU. For example, VOB-k is divided into four
interleaved units ILVUk1, ILVUk2, ILVUk3, and ILVUk4, and
is confined by a single cell CELL#k. VOB-m is likewise
divided into four interleaved units ILVUm1, ILVUm2,
ILVUm3, and ILVUm4, and is confined by a single cell
CELL#m. Note that instead of a single cell CELL#k or
CELL#m, each of VOB-k and VOB-m can be divided
into more than two cells. The interleaved units ILVU thus
contain both audio and video data.
[0268] In the example shown in Fig. 70, the interleaved
units ILVUk1, ILVUk2, ILVUk3, and ILVUk4, and
ILVUm1, ILVUm2, ILVUm3, and ILVUm4, from two
different video objects VOB-k and VOB-m are alternately
arrayed within a single interleaved block. By interleaving
the interleaved units ILVU of two video objects VOB in
this sequence, it is possible to achieve seamless
reproduction branching from one scene to one of plural
scenes, and from one of plural scenes to one scene.
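
The alternating arrangement of interleaved units can be sketched as
follows. This is a minimal illustration under stated assumptions,
not the encoder of the invention: the function names `interleave`
and `read_selected` are hypothetical, and units are represented by
their labels only, with no sizes, addresses, or jump timing.

```python
from itertools import chain, zip_longest
from typing import Dict, List


def interleave(vobs: Dict[str, List[str]]) -> List[str]:
    """Alternate the ILVU of each VOB: k1, m1, k2, m2, ..."""
    rounds = zip_longest(*vobs.values())
    return [unit for unit in chain.from_iterable(rounds) if unit is not None]


def read_selected(block: List[str], wanted: List[str]) -> List[str]:
    """Keep only the units of the selected VOB, skipping all others."""
    keep = set(wanted)
    return [unit for unit in block if unit in keep]


vobs = {
    "VOB-k": ["ILVUk1", "ILVUk2", "ILVUk3", "ILVUk4"],
    "VOB-m": ["ILVUm1", "ILVUm2", "ILVUm3", "ILVUm4"],
}
block = interleave(vobs)
print(block)                                # recorded order on the track
print(read_selected(block, vobs["VOB-m"]))  # units read when VOB-m is selected
```

Each skipped unit corresponds to one jump of the pickup, which is why
the size of the units has to be bounded by what the track buffer can
cover.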

Multi-scene control 

[0269] The multi-scene period is described together 
with the concept of multi-scene control according to the 
present invention using by way of example a title com- 
prising scenes recorded from different angles. 
[0270] Each scene in multi-scene control may instead be
recorded from the same angle but at different times, or
may even be computer graphics data. The
multi-angle scene periods may therefore also be called
multi-scene periods.

Parental control 

[0271] The concept of recording plural titles compris- 
ing alternative scenes for such functions as parental 
lock control and recording director's cuts is described 
below using Fig. 40. 

[0272] An example of a multi-rated title stream
providing for parental lock control is shown in Fig. 40. When
so-called "adult scenes" containing sex, violence, or
other scenes deemed unsuitable for children are
contained in a title implementing parental lock control, the
title stream is recorded with a combination of common
system streams Ssa, Ssb, and Sse, an adult-oriented
system stream Ssc containing the adult scenes, and a
child-oriented system stream Ssd containing only the
scenes suitable for children. Title streams such as this
are recorded as a multi-scene system stream
containing the adult-oriented system stream Ssc and the
child-oriented system stream Ssd arrayed to the multi-scene
period between common system streams Ssb and Sse.
[0273] The relationship between each of the compo- 
nent titles and the system stream recorded to the pro- 
gram chain PGC of a title stream thus comprised is 
described below. 

[0274] The adult-oriented title program chain PGC1 
comprises in sequence the common system streams 
Ssa and Ssb, the adult-oriented system stream Ssc, 
and the common system stream Sse. The child-oriented 
title program chain PGC2 comprises in sequence the 
common system streams Ssa and Ssb, the child-ori- 
ented system stream Ssd, and the common system 
stream Sse. 

[0275] By thus arraying the adult-oriented system
stream Ssc and child-oriented system stream Ssd to a
multi-scene period, the decoding method previously
described can reproduce the title containing
adult-oriented content by reproducing the common system
streams Ssa and Ssb, then selecting and reproducing
the adult-oriented system stream Ssc, and then
reproducing the common system stream Sse as instructed by
the adult-oriented title program chain PGC1. By
alternatively following the child-oriented title program chain
PGC2 and selecting the child-oriented system stream
Ssd in the multi-scene period, a child-oriented title from
which the adult-oriented scenes have been expurgated
can be reproduced.
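
The two program chains can be modelled as ordered lists of system
stream names. The sketch below is only an illustration of the
selection described above; the stream and chain names come from the
text, whereas the dictionary layout and the helper functions are
hypothetical.

```python
from typing import Dict, List

# Program chains as ordered lists of system streams, as described above.
PROGRAM_CHAINS: Dict[str, List[str]] = {
    "PGC1": ["Ssa", "Ssb", "Ssc", "Sse"],  # adult-oriented title
    "PGC2": ["Ssa", "Ssb", "Ssd", "Sse"],  # child-oriented title
}


def playback_order(pgc_name: str) -> List[str]:
    """Return the streams reproduced for the chosen title, in order."""
    return list(PROGRAM_CHAINS[pgc_name])


def shared_streams(a: str, b: str) -> List[str]:
    """Streams used by both titles, which need to be recorded only once."""
    in_b = set(PROGRAM_CHAINS[b])
    return [stream for stream in PROGRAM_CHAINS[a] if stream in in_b]


print(playback_order("PGC2"))
print(shared_streams("PGC1", "PGC2"))
```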

[0276] This method of providing in the title stream a
multi-scene period containing plural alternative scenes,
selecting which of the scenes in the multi-scene period
are to be reproduced before playback begins, and
generating plural titles containing essentially the same title
content but different scenes in part, is called parental
lock control.

[0277] Note that parental lock control is so named
because of the perceived need to protect children from
undesirable content. From the perspective of system
stream processing, however, parental lock control is a
technology for statically generating different title
streams by means of the user pre-selecting specific
scenes from a multi-scene period. Note, further, that
this contrasts with multi-angle scene control, which is a
technology for dynamically changing the content of a
single title by means of the user selecting scenes from
the multi-scene period freely and in real-time during title
playback.

[0278] This parental lock control technology can also
be used to enable title stream editing such as when
making the director's cut. The director's cut refers to the
process of editing certain scenes from a movie to, for
example, shorten the total presentation time. This may
be necessary, for example, to edit a feature-length
movie for viewing on an airplane where the presentation
time is too long for viewing within the flight time or
certain content may not be acceptable. The movie director
thus determines which scenes may be cut to shorten
the movie. The title can then be recorded with both a
full-length, unedited system stream and an edited
system stream in which the edited scenes are recorded to
multi-scene periods. At the transition from one system
stream to another system stream in such applications,
parental lock control must be able to maintain smooth
playback image output. More specifically, seamless
data reproduction whereby a data underflow state does
not occur in the audio, video, or other buffers, and
seamless information reproduction whereby no
unnatural interruptions are audibly or visibly perceived in the
audio and video playback, are necessary.

Multi-angle control

[0279] The concept of multi-angle scene control in the
present invention is described next with reference to
Fig. 33. In general, multimedia titles are obtained by
recording both the audio and video information
(collectively "recording" below) of the subject over time T. The
angled scene blocks #SC1, #SM1, #SM2, #SM3, and
#SC3 represent the multimedia scenes obtained at
recording unit times T1, T2, and T3 by recording the
subject at respective camera angles. Scenes #SM1,
#SM2, and #SM3 are recorded at mutually different
(first, second, and third) camera angles during
recording unit time T2, and are referenced below as the first,
second, and third angled scenes.
[0280] Note that the multi-scene periods referenced 
herein are basically assumed to comprise scenes 
recorded from different angles. The scenes may, how- 
ever, be recorded from the same angle but at different 
times, or they may be computer graphics data. The 
multi-angle scene periods are thus the multi-scene peri- 
ods from which plural scenes can be selected for pres- 
entation in the same time period, whether or not the 
scenes are actually recorded at different camera 
angles. 

[0281] Scenes #SC1 and #SC3 are scenes recorded 
at the same common camera angle during recording 
unit times T1 and T3, i.e., before and after the multi- 
angle scenes. These scenes are therefore called "com- 
mon angle scenes." Note that one of the multiple cam- 
era angles used in the multi-angle scenes is usually the 
same as the common camera angle. 
[0282] To understand the relationship between these
various angled scenes, multi-angle scene control is
described below using a live broadcast of a baseball
game by way of example only.

[0283] The common angle scenes #SC1 and #SC3 
are recorded at the common camera angle, which is 
here defined as the view from center field on the axis 
through the pitcher, batter, and catcher. 
[0284] The first angled scene #SM1 is recorded at the 
first multi-camera angle, i.e., the camera angle from the 
backstop on the axis through the catcher, pitcher, and 
batter. The second angled scene #SM2 is recorded at 
the second multi-camera angle, i.e., the view from 
center field on the axis through the pitcher, batter, and 
catcher. Note that the second angled scene #SM2 is 
thus the same as the common camera angle in this 
example. It therefore follows that the second angled 
scene #SM2 is the same as the common angle scene 
#SC2 recorded during recording unit time T2. The third 
angled scene #SM3 is recorded at the third multi-cam- 
era angle, i.e., the camera angle from the backstop 
focusing on the infield. 

[0285] The presentation times of the multiple angle
scenes #SM1, #SM2, and #SM3 overlap in recording
unit time T2; this period is called the "multi-angle scene
period." By freely selecting one of the multiple angle
scenes #SM1, #SM2, and #SM3 in this multi-angle
scene period, the viewer is able to change his or her
virtual viewing position to enjoy a different view of the
game as though the actual camera angle is changed.
Note that while there appears to be a time gap between
common angle scenes #SC1 and #SC3 and the multiple
angle scenes #SM1, #SM2, and #SM3 in Fig. 33, this is
simply to facilitate the use of arrows in the figure for
easier description of the data reproduction paths
reproduced by selecting different angled scenes. There is no
actual time gap during playback.

[0286] Multi-angle scene control of the system stream
based on the present invention is described next with
reference to Fig. 23 from the perspective of connecting
data blocks. The multimedia data corresponding to
common angle scene #SC is referenced as common
angle data BA, and the common angle data BA in
recording unit times T1 and T3 are referenced as BA1
and BA3, respectively. The multimedia data
corresponding to the multiple angle scenes #SM1, #SM2,
and #SM3 are referenced as first, second, and third
angle scene data MA1, MA2, and MA3. As previously
described with reference to Fig. 33, scenes from the
desired angle can be viewed by selecting one of the
multiple angle data units MA1, MA2, and MA3. There is
also no time gap between the common angle data BA1
and BA3 and the multiple angle data units MA1, MA2,
and MA3.

[0287] In the case of an MPEG system stream,
however, intermittent breaks in the playback information can
result between the reproduced common and multiple
angle data units depending upon the content of the data
at the connection between the selected multiple angle
data unit MA1, MA2, or MA3 and the common angle
data BA (either the first common angle data BA1 before
the angle selected in the multi-angle scene period or the
common angle data BA3 following the angle selected in
the multi-angle scene period). The result in this case is
that the title stream is not naturally reproduced as a
single contiguous title, i.e., seamless data reproduction is
achieved but non-seamless information reproduction
results.
[0288] The multi-angle selection process whereby one
of plural scenes is selectively reproduced from the
multi-angle scene period with seamless information
presentation to the scenes before and after is described below
with application in a digital video disk system using Fig.
23.

[0289] Changing the scene angle, i.e., selecting one
of the multiple angle data units MA1, MA2, and MA3,
must be completed before reproduction of the
preceding common angle data BA1 is completed. It is
extremely difficult, for example, to change to a different
angle data unit MA2 during reproduction of common
angle data BA1. This is because the multimedia data
has a variable length coded MPEG data structure,
which makes it difficult to find the data break points
(boundaries) in the selected data blocks. The video may
also be disrupted when the angle is changed because
inter-frame correlations are used in the coding process.
The group_of_pictures GOP processing unit of the
MPEG standard contains at least one refresh frame,
and closed processing not referencing frames
belonging to another GOP is possible within this GOP
processing unit.



[0290] In other words, if the desired angle data, e.g.,
MA3, is selected before reproduction reaches the multi- 
angle scene period, and at the latest by the time repro- 
duction of the preceding common angle data BA1 is 
completed, the angle data selected from within the 
multi-angle scene period can be seamlessly repro- 
duced. However, it is extremely difficult while reproduc- 
ing one angle to select and seamlessly reproduce 
another angle within the same multi-angle scene period.
It is therefore difficult when in a multi-angle scene 
period to dynamically select a different angle unit pre- 
senting, for example, a view from a different camera 
angle. 

Flow chart: encoder 

[0291] The encoding information table generated by 
the encoding system controller 200 from information 
extracted from the scenario data St7 is described below 
referring to Fig. 27. 

[0292] The encoding information table contains VOB 
set data streams containing plural VOB corresponding 
to the scene periods beginning and ending at the scene 
branching and connecting points, and VOB data 
streams corresponding to each scene. These VOB set 
data streams shown in Fig. 27 are the encoding infor- 
mation tables generated at step #100 in Fig. 34 by the 
encoding system controller 200 for creating the DVD 
multimedia stream based on the user-defined title con- 
tent. 

[0293] The user-defined scenario contains branching 
points from common scenes to plural scenes, or con- 
nection points to other common scenes. The VOB cor- 
responding to the scene period delimited by these 
branching and connecting points is a VOB set, and the
data generated to encode a VOB set is the VOB set 
data stream. The title number specified by the VOB set 
data stream is the title number TITLE_NO of the VOB 
set data stream. 

[0294] The VOB Set data structure in Fig. 27 shows
the data content for encoding one VOB set in the VOB
set data stream, and comprises: the VOB set number
VOBS_NO, the VOB number VOB_NO in the VOB set,
the preceding VOB seamless connection flag VOB_Fsb,
the following VOB seamless connection flag VOB_Fsf,
the multi-scene flag VOB_Fp, the interleave flag
VOB_Fi, the multi-angle flag VOB_Fm, the multi-angle
seamless switching flag VOB_FsV, the maximum bit
rate of the interleaved VOB ILV_BR, the number of
interleaved VOB divisions ILV_DIV, and the minimum
interleaved unit presentation time ILVU_MT.
[0295] The VOB set number VOBS_NO is a sequen- 
tial number identifying the VOB set and the position of 
the VOB set in the reproduction sequence of the title 
scenario. 

[0296] The VOB number VOB_NO is a sequential
number identifying the VOB and the position of the VOB
in the reproduction sequence of the title scenario.

[0297] The preceding VOB seamless connection flag
VOB_Fsb indicates whether a seamless connection
with the preceding VOB is required for scenario
reproduction.

[0298] The following VOB seamless connection flag
VOB_Fsf indicates whether there is a seamless
connection with the following VOB during scenario
reproduction.

[0299] The multi-scene flag VOB_Fp identifies
whether the VOB set comprises plural video objects
VOB.

[0300] The interleave flag VOB_Fi identifies whether
the VOB in the VOB set are interleaved.

[0301] The multi-angle flag VOB_Fm identifies
whether the VOB set is a multi-angle set.

[0302] The multi-angle seamless switching flag
VOB_FsV identifies whether angle changes within the
multi-angle scene period are seamless or not.

[0303] The maximum bit rate of the interleaved VOB
ILV_BR defines the maximum bit rate of the interleaved
VOBs.

[0304] The number of interleaved VOB divisions
ILV_DIV identifies the number of interleave units in the
interleaved VOB.

[0305] The minimum interleave unit presentation time
ILVU_MT defines the presentation time of the smallest
interleave unit with which a track buffer data underflow
state does not occur during interleaved block
reproduction when the bit rate of that interleave unit is the
maximum bit rate of the interleaved VOB ILV_BR.
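
The VOB set fields listed above map naturally onto a record type.
The sketch below is one possible representation, not the table
format used by the encoding system controller 200: the class name,
the Python field types, the units, and the consistency rules in
`problems` are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class VobSetData:
    """One VOB set entry of the encoding information table (Fig. 27)."""
    vobs_no: int    # VOBS_NO: VOB set number
    vob_no: int     # VOB_NO: VOB number in the VOB set
    vob_fsb: bool   # VOB_Fsb: seamless connection with preceding VOB
    vob_fsf: bool   # VOB_Fsf: seamless connection with following VOB
    vob_fp: bool    # VOB_Fp: multi-scene flag
    vob_fi: bool    # VOB_Fi: interleave flag
    vob_fm: bool    # VOB_Fm: multi-angle flag
    vob_fsv: bool   # VOB_FsV: multi-angle seamless switching flag
    ilv_br: int     # ILV_BR: maximum bit rate (bits/s assumed)
    ilv_div: int    # ILV_DIV: number of interleave units
    ilvu_mt: float  # ILVU_MT: minimum ILVU presentation time (seconds assumed)

    def problems(self) -> List[str]:
        """List violations of the assumed (not source-stated) flag rules."""
        found = []
        if self.vob_fm and not self.vob_fp:
            found.append("multi-angle set without multi-scene flag")
        if self.vob_fsv and not self.vob_fm:
            found.append("seamless angle switching without multi-angle flag")
        if self.vob_fi and not self.vob_fp:
            found.append("interleaving without plural VOB")
        return found


# Hypothetical entry for a three-angle, interleaved VOB set.
angle_set = VobSetData(3, 4, True, True, True, True, True, False,
                       8_000_000, 4, 0.5)
print(angle_set.problems())
```
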
[0306] The encoding information table for each VOB
generated by the encoding system controller 200 based
on the scenario data St7 is described below referring to
Fig. 28. The VOB encoding parameters described
below and supplied to the video encoder 300, audio
encoder 700, and system encoder 900 for stream
encoding are produced based on this encoding
information table.

[0307] The VOB data streams shown in Fig. 28 are the
encoding information tables generated at step #100 in
Fig. 34 by the encoding system controller 200 for
creating the DVD multimedia stream based on the
user-defined title content.

[0308] The encoding unit is the video object VOB, and
the data generated to encode each video object VOB is
the VOB data stream. For example, a VOB set
comprising three angle scenes comprises three video objects
VOB. The data structure shown in Fig. 28 shows the
content of the data for encoding one VOB in the VOB
data stream.
[0309] The VOB data structure contains the video
material start time VOB_VST, the video material end
time VOB_VEND, the video signal type VOB_V_KIND,
the video encoding bit rate V_BR, the audio material
start time VOB_AST, the audio material end time
VOB_AEND, the audio coding method VOB_A_KIND,
and the audio encoding bit rate A_BR.
[0310] The video material start time VOB_VST is the
video encoding start time corresponding to the time of
the video signal.

[0311] The video material end time VOB_VEND is the
video encoding end time corresponding to the time of
the video signal.

[0312] The video material type VOB_V_KIND
identifies whether the encoded material is in the NTSC or PAL
format, for example, or is photographic material (a
movie, for example) converted to a television broadcast
format (so-called telecine conversion).

[0313] The video encoding bit rate V_BR is the bit rate
at which the video signal is encoded.

[0314] The audio material start time VOB_AST is the
audio encoding start time corresponding to the time of
the audio signal.

[0315] The audio material end time VOB_AEND is the
audio encoding end time corresponding to the time of
the audio signal.

[0316] The audio coding method VOB_A_KIND
identifies the audio encoding method as AC-3, MPEG, or
linear PCM, for example.

[0317] The audio encoding bit rate A_BR is the bit rate
at which the audio signal is encoded.
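
The per-VOB fields can likewise be sketched as a record. This is an
illustrative representation only: the class name, the use of
seconds for times and bits per second for rates, and the two
duration helpers are assumptions that do not appear in the
specification.

```python
from dataclasses import dataclass


@dataclass
class VobData:
    """One VOB entry of the encoding information table (Fig. 28)."""
    vob_vst: float    # VOB_VST: video material start time (seconds assumed)
    vob_vend: float   # VOB_VEND: video material end time (seconds assumed)
    vob_v_kind: str   # VOB_V_KIND: e.g. "NTSC", "PAL", or telecine material
    v_br: int         # V_BR: video encoding bit rate (bits/s assumed)
    vob_ast: float    # VOB_AST: audio material start time (seconds assumed)
    vob_aend: float   # VOB_AEND: audio material end time (seconds assumed)
    vob_a_kind: str   # VOB_A_KIND: e.g. "AC-3", "MPEG", or "LPCM"
    a_br: int         # A_BR: audio encoding bit rate (bits/s assumed)

    def video_duration(self) -> float:
        """Length of the video material covered by this VOB."""
        return self.vob_vend - self.vob_vst

    def audio_duration(self) -> float:
        """Length of the audio material covered by this VOB."""
        return self.vob_aend - self.vob_ast


# Hypothetical one-minute scene with matching audio and video spans.
scene = VobData(10.0, 70.0, "NTSC", 6_000_000, 10.0, 70.0, "AC-3", 384_000)
print(scene.video_duration(), scene.audio_duration())
```
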
[0318] The encoding parameters used by the video
encoder 300, sub-picture encoder 500, audio
encoder 700, and system encoder 900 for VOB
encoding are shown in Fig. 29. The encoding parameters
include: the VOB number VOB_NO, video encode start
time V_STTM, video encode end time V_ENDTM, the
video encode mode V_ENCMD, the video encode bit
rate V_RATE, the maximum video encode bit rate
V_MRATE, the GOP structure fixing flag GOP_Fxflag,
the video encode GOP structure GOPST, the initial
video encode data V_INTST, the last video encode data
V_ENDST, the audio encode start time A_STTM, the
audio encode end time A_ENDTM, the audio encode bit
rate A_RATE, the audio encode method A_ENCMD, the
audio start gap A_STGAP, the audio end gap
A_ENDGAP, the preceding VOB number B_VOB_NO,
and the following VOB number F_VOB_NO.
[0319] The VOB number VOB_NO is a sequential 
number identifying the VOB and the position of the VOB 
in the reproduction sequence of the title scenario. 
[0320] The video encode start time V_STTM is the 
start time of video material encoding. 
[0321] The video encode end time V_ENDTM is the 
end time of video material encoding. 
[0322] The video encode mode V_ENCMD is an 
encoding mode for declaring whether reverse telecine 
conversion shall be accomplished during video encod- 
ing to enable efficient coding when the video material is 
telecine converted material. 

[0323] The video encode bit rate V_RATE is the aver- 
age bit rate of video encoding. 

[0324] The maximum video encode bit rate V_MRATE
is the maximum bit rate of video encoding.
[0325] The GOP structure fixing flag GOP_Fxflag
specifies whether encoding is accomplished without
changing the GOP structure in the middle of the video
encoding process. This is a useful parameter for
declaring whether seamless switching is enabled in a
multi-angle scene period.

[0326] The video encode GOP structure GOPST is
the GOP structure data used in encoding.

[0327] The initial video encode data V_INTST sets the
initial value of the VBV buffer (decoder buffer) at the
start of video encoding, and is referenced during video
decoding to initialize the decoding buffer. This is a
useful parameter for declaring seamless reproduction with
the preceding encoded video stream.

[0328] The last video encode data V_ENDST sets the
end value of the VBV buffer (decoder buffer) at the end
of video encoding, and is referenced during video
decoding to initialize the decoding buffer. This is a
useful parameter for declaring seamless reproduction with
the preceding encoded video stream.
[0329] The audio encode start time A_STTM is the
start time of audio material encoding.

[0330] The audio encode end time A_ENDTM is the
end time of audio material encoding.

[0331] The audio encode bit rate A_RATE is the bit
rate used for audio encoding.

[0332] The audio encode method A_ENCMD
identifies the audio encoding method as AC-3, MPEG, or
linear PCM, for example.

[0333] The audio start gap A_STGAP is the time
offset between the start of the audio and video
presentation at the beginning of a VOB. This is a useful
parameter for declaring seamless reproduction with the
preceding encoded system stream.

[0334] The audio end gap A_ENDGAP is the time
offset between the end of the audio and video presentation
at the end of a VOB. This is a useful parameter for
declaring seamless reproduction with the preceding
encoded system stream.

[0335] The preceding VOB number B_VOB_NO is the
VOB_NO of the preceding VOB when there is a
seamlessly connected preceding VOB.
[0336] The following VOB number F_VOB_NO is the 
VOB_NO of the following VOB when there is a seam- 
lessly connected following VOB. 

[0337] The operation of a DVD encoder ECD
according to the present invention is described below with
reference to the flow chart in Fig. 34. Note that the steps
shown with a double line are subroutines. It should be
obvious that while the operation described below
relates specifically in this case to the DVD encoder ECD
of the present invention, the operation described also
applies to an authoring encoder EC.
[0338] At step #100, the user inputs the editing
commands according to the user-defined scenario while
confirming the content of the multimedia source data
streams St1, St2, and St3.
[0339] At step #200, the scenario editor 100
generates the scenario data St7 containing the above edit
command information according to the user's editing
instructions.

[0340] When generating the scenario data St7 in step 
#200, the user editing commands related to multi-angle 
and parental lock multi-scene periods in which inter- 
leaving is presumed must be input to satisfy the follow- 
ing conditions. 

[0341] First, the VOB maximum bit rate must be set to
assure sufficient image quality, and the track buffer
capacity, jump performance, jump time, and jump
distance of the DVD decoder DCD used as the
reproduction apparatus of the DVD encoded data must be
determined. Based on these values, the reproduction
time of the shortest interleaved unit is obtained from
equations 3 and 4. Based on the reproduction time of
each scene in the multi-scene period, it must then be
determined whether equations 5 and 6 are satisfied. If
equations 5 and 6 are not satisfied, the user must
change the edit commands until equations 5 and 6 are
satisfied by, for example, connecting part of the
following scene to each scene in the multi-scene period.
[0342] When multi-angle edit commands are used, 
equation 7 must be satisfied for seamless switching, 
and edit commands matching the audio reproduction 
time with the reproduction time of each scene in each 
angle must be entered. If non-seamless switching is 
used, the user must enter commands to satisfy equation 
8. 

[0343] At step #300, the encoding system controller 
200 first determines whether the target scene is to be 
seamlessly connected to the preceding scene based on 
the scenario data St7. 

[0344] Note that when the preceding scene period is 
a multi-scene period comprising plural scenes but the 
presently selected target scene is a common scene (not 
in a multi-scene period), a seamless connection refers 
to seamlessly connecting the target scene with any one 
of the scenes contained in the preceding multi-scene 
period. When the target scene is a multi-scene period, a 
seamless connection still refers to seamlessly connect- 
ing the target scene with any one of the scenes from the 
same multi-scene period. 

[0345] If step #300 returns NO, i.e., a non-seamless 
connection is valid, the procedure moves to step #400. 
[0346] At step #400, the encoding system controller 
200 resets the preceding VOB seamless connection 
flag VOB_Fsb indicating whether there is a seamless 
connection between the target and preceding scenes. 
The procedure then moves to step #600. 
[0347] On the other hand, if step #300 returns YES, 
i.e., there is a seamless connection to the preceding 
scene, the procedure moves to step #500. 
[0348] At step #500 the encoding system controller 
200 sets the preceding VOB seamless connection flag 
VOB_Fsb. The procedure then moves to step #600. 
[0349] At step #600 the encoding system controller 
200 determines whether there is a seamless connection 
between the target and following scenes based on scenario
data St7. If step #600 returns NO, i.e., a non-seamless
connection is valid, the procedure moves to
step #700.

[0350] At step #700, the encoding system controller 
200 resets the following VOB seamless connection flag 
VOB_Fsf indicating whether there is a seamless con- 
nection with the following scene. The procedure then 
moves to step #900. 

[0351] However, if step #600 returns YES, i.e., there
is a seamless connection to the following scene, the 
procedure moves to step #800. 

[0352] At step #800 the encoding system controller 
200 sets the following VOB seamless connection flag 
VOB_Fsf. The procedure then moves to step #900. 
[0353] At step #900 the encoding system controller 
200 determines whether there is more than one connection
target scene, i.e., whether a multi-scene period is 
selected, based on the scenario data St7. As previously 
described, there are two possible control methods in 
multi-scene periods: parental lock control whereby only 
one of plural possible reproduction paths that can be 
constructed from the scenes in the multi-scene period is 
reproduced, and multi-angle control whereby the repro- 
duction path can be switched within the multi-scene 
period to present different viewing angles. 
[0354] If step #900 returns NO, i.e., there are not mul- 
tiple scenes, the procedure moves to step #1000. 
[0355] At step #1000 the multi-scene flag VOB_Fp 
identifying whether the VOB set comprises plural video 
objects VOB (a multi-scene period is selected) is reset, 
and the procedure moves to step #1800 for encode 
parameter production. This encode parameter produc- 
tion subroutine is described below. 
[0356] However, if step #900 returns YES, i.e., there is a
multi-scene connection, the procedure moves to step
#1100.

[0357] At step #1100, the multi-scene flag VOB_Fp is
set, and the procedure moves to step #1200, where it
is judged whether a multi-angle connection is selected.

[0358] At step #1200 it is determined whether a 
change is made between plural scenes in the multi- 
scene period, i.e., whether a multi-angle scene period is 
selected. If step #1200 returns NO, i.e., no scene
change is allowed in the multi-scene period as parental 
lock control reproducing only one reproduction path has 
been selected, the procedure moves to step #1300. 
[0359] At step #1300 the multi-angle flag VOB_Fm 
identifying whether the target connection scene is a 
multi-angle scene is reset, and the procedure moves to 
step #1302. 

[0360] At step #1302 it is determined whether either 
the preceding VOB seamless connection flag VOB_Fsb 
or following VOB seamless connection flag VOB_Fsf is 
set. If step #1302 returns YES, i.e., the target connection
scene seamlessly connects to the preceding, the
following, or both the preceding and following scenes, 
the procedure moves to step #1304. 
[0361] At step #1304 the interleave flag VOB_Fi identifying
whether the VOB, the encoded data of the target
scene, is interleaved is set. The procedure then moves
to step #1800.

[0362] However, if step #1302 returns NO, i.e., the target
connection scene does not seamlessly connect to
the preceding or following scene, the procedure moves
to step #1306.

[0363] At step #1306 the interleave flag VOB_Fi is 
reset, and the procedure moves to step #1800. 
[0364] If step #1200 returns YES, however, i.e., there
is a multi-angle connection, the procedure moves to
step #1400.

[0365] At step #1400, the multi-angle flag VOB_Fm
and interleave flag VOB_Fi are set, and the procedure
moves to step #1500.

[0366] At step #1500 the encoding system controller
200 determines whether the audio and video can be
seamlessly switched in a multi-angle scene period, i.e.,
at a reproduction unit smaller than the VOB, based on
the scenario data St7. If step #1500 returns NO, i.e.,
non-seamless switching occurs, the procedure moves
to step #1600.

[0367] At step #1600 the multi-angle seamless switch- 
ing flag VOB_FsV indicating whether angle changes 
within the multi-angle scene period are seamless or not
is reset, and the procedure moves to step #1800. 
[0368] However, if step #1500 returns YES, i.e., seamless
switching occurs, the procedure moves to step
#1700.

[0369] At step #1700 the multi-angle seamless switching
flag VOB_FsV is set, and the procedure moves to
step #1800.
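
The flag decisions of steps #300 through #1700 can be summarized in a short sketch. This is only an illustration under stated assumptions: the function name, its boolean inputs, and the dataclass are hypothetical stand-ins for information read from the scenario data St7, and flags that the text does not mention on a given path are assumed to stay reset.

```python
from dataclasses import dataclass


@dataclass
class VobFlags:
    """Flags named after the text; every flag defaults to 0 (reset)."""
    VOB_Fsb: int = 0  # seamless connection to the preceding scene
    VOB_Fsf: int = 0  # seamless connection to the following scene
    VOB_Fp: int = 0   # multi-scene period
    VOB_Fi: int = 0   # VOB is interleaved
    VOB_Fm: int = 0   # multi-angle scene
    VOB_FsV: int = 0  # seamless angle switching


def decide_flags(seamless_prev, seamless_next, multi_scene,
                 multi_angle=False, seamless_switch=False):
    """Hypothetical helper mirroring steps #300 to #1700."""
    f = VobFlags()
    f.VOB_Fsb = int(seamless_prev)           # steps #300 to #500
    f.VOB_Fsf = int(seamless_next)           # steps #600 to #800
    if not multi_scene:                      # step #900 NO, step #1000
        return f
    f.VOB_Fp = 1                             # step #1100
    if not multi_angle:                      # step #1200 NO: parental lock
        f.VOB_Fi = int(bool(f.VOB_Fsb or f.VOB_Fsf))  # steps #1302 to #1306
        return f
    f.VOB_Fm = 1                             # step #1400
    f.VOB_Fi = 1
    f.VOB_FsV = int(seamless_switch)         # steps #1500 to #1700
    return f
```

In every branch the next stage would be encode parameter production (step #1800).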

[0370] Therefore, as shown by the flow chart in Fig.
34, encode parameter production (step #1800) is only
begun after the editing information is detected from the
above flag settings in the scenario data St7 reflecting
the user-defined editing instructions.
[0371] Based on the user-defined editing instructions
detected from the above flag settings in the scenario
data St7, information is added to the encoding information
tables for the VOB Set units and VOB units as
shown in Figs. 27 and 28 to encode the source streams,
and the encoding parameters of the VOB data units
shown in Fig. 29 are produced, in step #1800. The procedure
then moves to step #1900 for audio and video
encoding.

[0372] The encode parameter production steps (step 
#1800) are described in greater detail below referring to 
Figs. 35, 36, 37, and 38.

[0373] Based on the encode parameters produced in
step #1800, the video data and audio data are encoded
in step #1900, and the procedure moves to step #2000.
[0374] Note that the sub-picture data is normally
inserted during video reproduction on an as-needed
basis, and contiguity with the preceding and following
scenes is therefore not usually necessary. Moreover,
the sub-picture data is normally video information for
one frame, and unlike audio and video data having an
extended time-base, sub-picture data is usually static,
and is not normally presented continuously. Because
the present invention relates specifically to seamless
and non-seamless contiguous reproduction as
described above, description of sub-picture data encoding
is omitted herein for simplicity.
[0375] Step #2000 is the last step in a loop comprising
steps #300 to step #2000, and causes this loop to be
repeated as many times as there are VOB Sets. This
loop formats the program chain VTS_PGC#i to contain
the reproduction sequence and other reproduction information
for each VOB in the title (Fig. 16) in the program
chain data structure, interleaves the VOB in the multi-scene
periods, and completes the VOB Set data stream
and VOB data stream needed for system stream encoding.
The procedure then moves to step #2100.
[0376] At step #2100 the VOB Set data stream is completed
as the encoding information table by adding the
total number of VOB Sets VOBS_NUM obtained as a
result of the loop through step #2000 to the VOB Set
data stream, and setting the number of titles TITLE_NO
defining the number of scenario reproduction paths in
the scenario data St7. The procedure then moves to
step #2200.

[0377] System stream encoding producing the VOB
(VOB#i) data in the VTS title VOBS (VTSTT_VOBS)
(Fig. 16) is accomplished in step #2200 based on the
encoded video stream and encoded audio stream output
from step #1900, and the encode parameters in Fig.
29. The procedure then moves to step #2300.
[0378] At step #2300 the VTS information VTSI, VTSI
management table VTSI_MAT, VTSPGC information
table VTS_PGCIT, and the program chain information
VTS_PGCI#i controlling the VOB data reproduction
sequence shown in Fig. 16 are produced, and formatting
to, for example, interleave the VOB contained in the
multi-scene periods, is accomplished.
[0379] The encode parameter production subroutine
shown as step #1800 in Fig. 34B is described next with
reference to Figs. 35, 36, and 37, using by way of example
the operation generating the encode parameters for
multi-angle control.

[0380] Starting from Fig. 35, the process for generating
the encode parameters of a non-seamless switching
stream with multi-angle control is described first. This
stream is generated when step #1500 in Fig. 34 returns
NO and the following flags are set as shown: VOB_Fsb
= 1 or VOB_Fsf = 1, VOB_Fp = 1, VOB_Fi = 1, VOB_Fm
= 1, and VOB_FsV = 0. The following operation produces
the encoding information tables shown in Fig. 27
and Fig. 28, and the encode parameters shown in Fig.
29.

[0381] At step #1812, the scenario reproduction
sequence (path) contained in the scenario data St7 is
extracted, the VOB Set number VOBS_NO is set, and
the VOB number VOB_NO is set for one or more VOB in
the VOB Set.

[0382] At step #1814 the maximum bit rate ILV_BR of
the interleaved VOB is extracted from the scenario data
St7, and the maximum video encode bit rate V_MRATE
from the encode parameters is set based on the interleave
flag VOB_Fi setting (= 1).

[0383] At step #1816, the minimum interleaved unit
presentation time ILVU_MT is extracted from the scenario
data St7.

[0384] At step #1818, the video encode GOP struc- 
ture GOPST values N = 15 and M = 3 are set, and the 
GOP structure fixing flag GOP_Fxflag is set (= 1), 
based on the multi-scene flag VOB_Fp setting (= 1). 
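
As a small illustration of the GOP structure values, the sketch below lists one picture type per position for N = 15 and M = 3. It assumes the common MPEG reading of these values (N pictures per GOP, an I or P picture every M positions), which this passage does not itself define, and it ignores the difference between coding order and display order. The function name is hypothetical.

```python
def gop_picture_types(n=15, m=3):
    """Return one picture type per position in a GOP of n pictures.

    Assumption: position 0 is the I picture, every m-th position after it
    is a P picture, and all remaining positions are B pictures.
    """
    types = []
    for i in range(n):
        if i == 0:
            types.append("I")
        elif i % m == 0:
            types.append("P")
        else:
            types.append("B")
    return "".join(types)


print(gop_picture_types())  # IBBPBBPBBPBBPBB
```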
[0385] Step #1820 is the common VOB data setting 
routine, which is described below referring to the flow 
chart in Fig. 36. This common VOB data setting routine 
produces the encoding information tables shown in 
Figs. 27 and 28, and the encode parameters shown in 
Fig. 29. 

[0386] At step #1822 the video material start time 
VOB_VST and video material end time VOB_VEND are 
extracted for each VOB, and the video encode start time 
V_STTM and video encode end time V_ENDTM are 
used as video encoding parameters. 
[0387] At step #1824 the audio material start time 
VOB_AST of each VOB is extracted from the scenario 
data St7, and the audio encode start time A_STTM is 
set as an audio encoding parameter. 
[0388] At step #1826 the audio material end time
VOB_AEND is extracted for each VOB from the scenario
data St7. A time not exceeding the VOB_AEND time
and falling on an audio access unit (AAU) boundary is
set as the audio encode end time A_ENDTM, which is an
audio encoding parameter. Note that the audio access
unit AAU is determined by the audio encoding method.

[0389] At step #1828 the audio start gap A_STGAP
obtained from the difference between the video encode 
start time V_STTM and the audio encode start time 
A_STTM is defined as a system encode parameter. 
[0390] At step #1830 the audio end gap A_ENDGAP
obtained from the difference between the video encode 
end time V_ENDTM and the audio encode end time 
A_ENDTM is defined as a system encode parameter. 
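
The audio timing values of steps #1824 through #1830 can be worked through with a short sketch. Several points are assumptions rather than statements of the text: times are integer clock ticks, A_ENDTM is taken as the last AAU boundary (counted from A_STTM) that does not exceed VOB_AEND, and the sign conventions of the two gaps are chosen for illustration. The function name is hypothetical.

```python
def audio_timing(v_sttm, v_endtm, a_sttm, vob_aend, aau_len):
    """Derive A_ENDTM, A_STGAP and A_ENDGAP from integer tick values.

    aau_len is the duration of one audio access unit in the same ticks;
    it depends on the audio encoding method.
    """
    if aau_len <= 0 or vob_aend < a_sttm:
        raise ValueError("invalid audio timing input")
    whole_aaus = (vob_aend - a_sttm) // aau_len
    a_endtm = a_sttm + whole_aaus * aau_len  # last AAU boundary <= VOB_AEND
    return {
        "A_ENDTM": a_endtm,
        "A_STGAP": a_sttm - v_sttm,     # assumed sign: audio start minus video start
        "A_ENDGAP": v_endtm - a_endtm,  # assumed sign: video end minus audio end
    }


# Example with a 90 kHz tick and a 2880-tick AAU (both assumptions).
print(audio_timing(0, 90000, 900, 90000, 2880))
```

With these example inputs, 30 whole AAUs fit before VOB_AEND, so A_ENDTM is 87300, A_STGAP is 900, and A_ENDGAP is 2700.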
[0391] At step #1832 the video encoding bit rate V_BR
is extracted from the scenario data St7, and the video
encode bit rate V_RATE, which is the average bit rate of 
video encoding, is set as a video encoding parameter. 
[0392] At step #1834 the audio encoding bit rate 
A_BR is extracted from the scenario data St7, and the 
audio encode bit rate A_RATE is set as an audio encod- 
ing parameter. 

[0393] At step #1836 the video material type 
VOB_V_KIND is extracted from the scenario data St7. If 
the material is a film type, i.e., a movie converted to tel- 
evision broadcast format (so-called telecine conver- 
sion), reverse telecine conversion is set for the video 
encode mode V_ENCMD, and defined as a video 
encoding parameter. 

[0394] At step #1838 the audio coding method
VOB_A_KIND is extracted from the scenario data St7,
and the encoding method is set as the audio encode
method A_ENCMD and defined as an audio encoding
parameter.

[0395] At step #1840 the initial video encode data
V_INTST sets the initial value of the VBV buffer to a
value less than the VBV buffer end value set by the last
video encode data V_ENDST, and is defined as a video
encoding parameter.

[0396] At step #1842 the VOB number VOB_NO of the
preceding connection is set to the preceding VOB 
number B_VOB_NO based on the setting (= 1) of the 
preceding VOB seamless connection flag VOB_Fsb, 
and set as a system encode parameter. 
[0397] At step #1844 the VOB number VOB_NO of the
following connection is set to the following VOB number 
F_VOB_NO based on the setting (= 1) of the following 
VOB seamless connection flag VOB_Fsf, and set as a 
system encode parameter. 

[0398] The encoding information table and encode
parameters are thus generated for a multi-angle VOB 
Set with non-seamless multi-angle switching control 
enabled. 

[0399] The process for generating the encode parameters
of a seamless switching stream with multi-angle
control is described below with reference to Fig. 37.
This stream is generated when step #1500 in Fig. 34
returns YES and the following flags are set as shown:
VOB_Fsb = 1 or VOB_Fsf = 1, VOB_Fp = 1, VOB_Fi =
1, VOB_Fm = 1, and VOB_FsV = 1. The following operation
produces the encoding information tables shown
in Fig. 27 and Fig. 28, and the encode parameters
shown in Fig. 29.

[0400] The following operation produces the encoding
information tables shown in Fig. 27 and Fig. 28, and the
encode parameters shown in Fig. 29.
[0401] At step #1850, the scenario reproduction 
sequence (path) contained in the scenario data St7 is 
extracted, the VOB Set number VOBS_NO is set, and 
the VOB number VOB_NO is set for one or more VOB in
the VOB Set. 

[0402] At step #1852 the maximum bit rate ILV_BR of
the interleaved VOB is extracted from the scenario data
St7, and the maximum video encode bit rate V_MRATE
from the encode parameters is set based on the interleave
flag VOB_Fi setting (= 1).
[0403] At step #1854, the minimum interleaved unit 
presentation time ILVU_MT is extracted from the sce- 
nario data St7. 

[0404] At step #1856, the video encode GOP structure
GOPST values N = 15 and M = 3 are set, and the
GOP structure fixing flag GOP_Fxflag is set (= 1), 
based on the multi-scene flag VOB_Fp setting (= 1). 
[0405] At step #1858, the video encode GOP GOPST
is set to "closed GOP" based on the multi-angle seamless
switching flag VOB_FsV setting (= 1), and the video
encoding parameters are thus defined.
[0406] Step #1860 is the common VOB data setting
routine, which is as described referring to the flow chart
in Fig. 35. Further description thereof is thus omitted
here.

[0407] The encode parameters of a seamless switch- 
ing stream with multi-angle control are thus defined for 
a VOB Set with multi-angle control as described above. 
[0408] The process for generating the encode param- 
eters for a system stream in which parental lock control 
is implemented is described below with reference to Fig. 
38. This stream is generated when step #1200 in Fig. 34 
returns NO and step #1304 returns YES, i.e., the following
flags are set as shown: VOB_Fsb = 1 or VOB_Fsf =
1, VOB_Fp = 1, VOB_Fi = 1, VOB_Fm = 0. The following
operation produces the encoding information tables
shown in Fig. 27 and Fig. 28, and the encode parame- 
ters shown in Fig. 29. 

[0409] At step #1870, the scenario reproduction 
sequence (path) contained in the scenario data St7 is 
extracted, the VOB Set number VOBS_NO is set, and
the VOB number VOB_NO is set for one or more VOB in 
the VOB Set. 

[0410] At step #1872 the maximum bit rate ILV_BR of
the interleaved VOB is extracted from the scenario data
St7, and the maximum video encode bit rate V_MRATE
from the encode parameters is set based on the interleave
flag VOB_Fi setting (= 1).

[0411] At step #1874 the number of interleaved VOB
divisions ILV_DIV is extracted from the scenario data
St7.

[0412] Step #1876 is the common VOB data setting 
routine, which is as described referring to the flow chart 
in Fig. 35. Further description thereof is thus omitted 
here. 

[0413] The encode parameters of a system stream in 
which parental lock control is implemented are thus 
defined for a VOB Set with multi-scene selection control 
enabled as described above. 

[0414] The process for generating the encode param- 
eters for a system stream containing a single scene is 
described below with reference to Fig. 32. This stream 
is generated when step #900 in Fig. 34 returns NO, i.e., 
when VOB_Fp=0. The following operation produces the 
encoding information tables shown in Fig. 27 and Fig. 
28, and the encode parameters shown in Fig. 29. 
[0415] At step #1880, the scenario reproduction 
sequence (path) contained in the scenario data St7 is 
extracted, the VOB Set number VOBS_NO is set, and 
the VOB number VOB_NO is set for one or more VOB in 
the VOB Set. 

[0416] At step #1882 the maximum bit rate ILV_BR of
the interleaved VOB is extracted from the scenario data
St7, and the maximum video encode bit rate V_MRATE
from the encode parameters is set based on the interleave
flag VOB_Fi setting (= 1).

[0417] Step #1884 is the common VOB data setting 
routine, which is as described referring to the flow chart 
in Fig. 35. Further description thereof is thus omitted
here.
[0418] These flow charts for defining the encoding
information table and encode parameters thus generate 
the parameters for DVD video, audio, and system 
stream encoding by the DVD formatter. 
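
The choice among the four parameter production flows described above depends only on the flag values. The following sketch is a hypothetical summary of that choice, not the procedure of any figure; the function name and the returned labels are invented for illustration, and the combination VOB_Fp = 1, VOB_Fm = 0, VOB_Fi = 0 is reported as not covered because the text does not describe a flow for it.

```python
def select_parameter_routine(vob_fp, vob_fi, vob_fm, vob_fsv):
    """Map the flag values to the encode parameter flow described in the text."""
    if not vob_fp:
        return "single scene"
    if vob_fm:
        if vob_fsv:
            return "seamless multi-angle switching"
        return "non-seamless multi-angle switching"
    if vob_fi:
        return "parental lock"
    return "not covered by the text"  # assumption: no flow is described here


print(select_parameter_routine(vob_fp=1, vob_fi=1, vob_fm=1, vob_fsv=0))
```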

Decoder flow charts 
Disk-to-stream buffer transfer flow 

[0419] The decoding information table produced by
the decoding system controller 2300 based on the scenario
selection data St51 is described below referring to
Figs. 58 and 59. The decoding information table comprises
the decoding system table shown in Fig. 58, and
the decoding table shown in Fig. 59.
[0420] As shown in Fig. 58, the decoding system table
comprises a scenario information register and a cell
information register. The scenario information register
records the title number and other scenario reproduction
information selected by the user and extracted from
the scenario selection data St51. The cell information
register extracts and records the information required to
reproduce the cells constituting the program chain PGC
based on the user-defined scenario information
extracted into the scenario information register.
[0421] More specifically, the scenario information register
contains plural sub-registers, i.e., the angle
number ANGLE_NO_reg, VTS number VTS_NO_reg,
PGC number VTS_PGCI_NO_reg, audio ID
AUDIO_ID_reg, sub-picture ID SP_ID_reg, and the system
clock reference SCR buffer SCR_buffer.
[0422] The angle number ANGLE_NO_reg stores
which angle is reproduced when there are multiple 
angles in the reproduction program chain PGC. 
[0423] The VTS number VTS_NO_reg records the
number of the next VTS reproduced from among the
plural VTS on the disk.

[0424] The PGC number VTS_PGCI_NO_reg records
which of the plural program chains PGC present in the
video title set VTS is to be reproduced for parental lock
control or other applications.

[0425] The audio ID AUDIO_ID_reg records which of
the plural audio streams in the VTS are to be repro- 
duced. 

[0426] The sub-picture ID SP_ID_reg records which of
the plural sub-picture streams is to be reproduced when 
there are plural sub-picture streams in the VTS. 
[0427] The system clock reference SCR buffer 
SCR_buffer is the buffer for temporarily storing the system
clock reference SCR recorded to the pack header
as shown in Fig. 19. As described using Fig. 26, this 
temporarily stored system clock reference SCR is out- 
put to the decoding system controller 2300 as the bit- 
stream control data St63. 

[0428] The cell information register contains the following
sub-registers: the cell block mode CBM_reg, cell
block type CBT_reg, seamless reproduction flag
SPF_reg, interleaved allocation flag IAF_reg, STC
resetting flag STCDF, seamless angle change flag
SACF_reg, first cell VOBU start address
C_FVOBU_SA_reg, and last cell VOBU start address
C_LVOBU_SA_reg.

[0429] The cell block mode CBM_reg stores a value 
indicating whether plural cells constitute one functional 
block. If there are not plural cells in one functional block, 
CBM_reg stores N_BLOCK. If plural cells constitute one 
functional block, the value F_CELL is stored as the 
CBM_reg value of the first cell in the block, L_CELL is 
stored as the CBM_reg value of the last cell in the block, 
and BLOCK is stored as the CBM_reg value of all cells
between the first and last cells in the block. 
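
The assignment of cell block mode values can be illustrated as follows. This is a minimal sketch assuming that a block is given simply as a count of cells; the function name is hypothetical.

```python
def cell_block_modes(cells_in_block):
    """Return the CBM_reg value for each cell of one functional block."""
    if cells_in_block < 1:
        raise ValueError("a block needs at least one cell")
    if cells_in_block == 1:
        return ["N_BLOCK"]  # a lone cell does not form a block
    middle = ["BLOCK"] * (cells_in_block - 2)
    return ["F_CELL"] + middle + ["L_CELL"]


print(cell_block_modes(4))  # ['F_CELL', 'BLOCK', 'BLOCK', 'L_CELL']
```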
[0430] The cell block type CBT_reg stores a value
defining the type of the block indicated by the cell block
mode CBM_reg. If the cell block is a multi-angle block,
A_BLOCK is stored; if not, N_BLOCK is stored.
[0431] The seamless reproduction flag SPF_reg 
stores a value defining whether that cell is seamlessly
connected with the cell or cell block reproduced therebefore.
If a seamless connection is specified, SML is
stored; if a seamless connection is not specified, NSML 
is stored. 

[0432] The interleaved allocation flag IAF_reg stores
a value identifying whether the cell exists in a contiguous
or interleaved block. If the cell is part of an interleaved
block, ILVB is stored; otherwise N_ILVB is
stored.

[0433] The STC resetting flag STCDF defines whether 
the system time clock STC used for synchronization 
must be reset when the cell is reproduced; when reset- 
ting the system time clock STC is necessary, 
STC_RESET is stored; if resetting is not necessary, 
STC_NRESET is stored.

[0434] The seamless angle change flag SACF_reg 
stores a value indicating whether a cell in a multi-angle 
period should be connected seamlessly at an angle 
change. If the angle change is seamless, the seamless 
angle change flag SACF is set to SML; otherwise it is 
set to NSML. 

[0435] The first cell VOBU start address
C_FVOBU_SA_reg stores the VOBU start address of
the first cell in a block. The value of this address is
expressed as the distance from the logic sector of the
first cell in the VTS title VOBS (VTSTT_VOBS),
measured and stored as the number of
sectors.

[0436] The last cell VOBU start address
C_LVOBU_SA_reg stores the VOBU start address of
the last cell in the block. The value of this address is
also expressed as the distance from the logic sector of
the first cell in the VTS title VOBS (VTSTT_VOBS),
measured and stored as the number of
sectors.
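
The cell information register described above can be modeled as a small record. The sketch below is only one possible representation: the class, the default values, and the validation are assumptions made for illustration, and the value spellings N_ILVB and STC_NRESET are assumed readings of the names given in the text.

```python
from dataclasses import dataclass

# Allowed symbolic values per sub-register, as named in the text.
ALLOWED = {
    "CBM_reg": {"N_BLOCK", "F_CELL", "BLOCK", "L_CELL"},
    "CBT_reg": {"N_BLOCK", "A_BLOCK"},
    "SPF_reg": {"SML", "NSML"},
    "IAF_reg": {"ILVB", "N_ILVB"},
    "STCDF": {"STC_RESET", "STC_NRESET"},
    "SACF_reg": {"SML", "NSML"},
}


@dataclass
class CellInformationRegister:
    CBM_reg: str = "N_BLOCK"
    CBT_reg: str = "N_BLOCK"
    SPF_reg: str = "NSML"
    IAF_reg: str = "N_ILVB"
    STCDF: str = "STC_NRESET"
    SACF_reg: str = "NSML"
    C_FVOBU_SA_reg: int = 0  # sector offset of the first cell VOBU
    C_LVOBU_SA_reg: int = 0  # sector offset of the last cell VOBU

    def __post_init__(self):
        # Reject values the text does not define for a sub-register.
        for name, allowed in ALLOWED.items():
            value = getattr(self, name)
            if value not in allowed:
                raise ValueError(f"{name}: unexpected value {value!r}")
        if self.C_FVOBU_SA_reg < 0 or self.C_LVOBU_SA_reg < 0:
            raise ValueError("sector offsets must be non-negative")


reg = CellInformationRegister(CBM_reg="F_CELL", CBT_reg="A_BLOCK", SACF_reg="SML")
```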

[0437] The decoding table shown in Fig. 59 is
described below. As shown in Fig. 59, the decoding
table comprises the following registers: information registers
for non-seamless multi-angle control, information
registers for seamless multi-angle control, a VOBU
information register, and information registers for seamless
reproduction.

[0438] The information registers for non-seamless
multi-angle control comprise sub-registers
NSML_AGL_C1_DSTA_reg - NSML_AGL_C9_DSTA_reg.
[0439] NSML_AGL_C1_DSTA_reg -
NSML_AGL_C9_DSTA_reg record the
NSML_AGL_C1_DSTA - NSML_AGL_C9_DSTA values
in the PCI packet shown in Fig. 20.
[0440] The information registers for seamless multi-angle
control comprise sub-registers
SML_AGL_C1_DSTA_reg - SML_AGL_C9_DSTA_reg.
[0441] SML_AGL_C1_DSTA_reg -
SML_AGL_C9_DSTA_reg record the
SML_AGL_C1_DSTA - SML_AGL_C9_DSTA values in
the DSI packet shown in Fig. 20.
[0442] The VOBU information register stores the end
pack address VOBU_EA in the DSI packet shown in Fig.
20.

[0443] The information registers for seamless reproduction
comprise the following sub-registers: an interleaved
unit flag ILVU_flag_reg, Unit END flag
UNIT_END_flag_reg, Interleaved Unit End Address
ILVU_EA_reg, Next Interleaved Unit Start Address
NT_ILVU_SA_reg, the presentation start time of the first
video frame in the VOB (Initial Video Frame Presentation
Start Time) VOB_V_SPTM_reg, the presentation
end time of the last video frame in the VOB (Final Video
Frame Presentation Termination Time)
VOB_V_EPTM_reg, audio reproduction stopping time 1
VOB_A_STP_PTM1_reg, audio reproduction stopping
time 2 VOB_A_STP_PTM2_reg, audio reproduction
stopping period 1 VOB_A_GAP_LEN1_reg, and audio
reproduction stopping period 2
VOB_A_GAP_LEN2_reg.

[0444] The interleaved unit flag ILVU_flag_reg stores
the value indicating whether the video object unit VOBU
is in an interleaved block, and stores ILVU if it is, and
N_ILVU if not.

[0445] The Unit END flag UNIT_END_flag_reg stores
the value indicating whether the video object unit VOBU
is the last VOBU in the interleaved unit ILVU. Because
the interleaved unit ILVU is the data unit for continuous
reading, the UNIT_END_flag_reg stores END if the 
VOBU currently being read is the last VOBU in the inter- 
leaved unit ILVU, and otherwise stores N_END. 
[0446] The Interleaved Unit End Address
ILVU_EA_reg stores the address of the last pack in the
ILVU to which the VOBU belongs if the VOBU is in an
interleaved block. This address is expressed as the
number of sectors from the navigation pack NV of that
VOBU.

[0447] The Next Interleaved Unit Start Address
NT_ILVU_SA_reg stores the start address of the next
interleaved unit ILVU if the VOBU is in an interleaved
block. This address is also expressed as the number of
sectors from the navigation pack NV of that VOBU.
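
Because both addresses are relative to the navigation pack NV of the VOBU, a reader position can be derived by simple addition. The sketch below assumes that both offsets point forward from the sector of the navigation pack and that the next interleaved unit lies after the current one; the function name and the example numbers are hypothetical.

```python
def ilvu_sectors(nv_pack_sector, ilvu_ea, nt_ilvu_sa):
    """Convert NV-relative offsets into absolute sector numbers.

    Returns the sector of the last pack of the current ILVU, the start
    sector of the next ILVU, and the number of sectors skipped between them.
    """
    end_sector = nv_pack_sector + ilvu_ea
    next_start = nv_pack_sector + nt_ilvu_sa
    skipped = next_start - end_sector - 1
    return end_sector, next_start, skipped


print(ilvu_sectors(1000, 49, 200))  # (1049, 1200, 150)
```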
[0448] The Initial Video Frame Presentation Start 
Time register VOB_V_SPTM_reg stores the time at 
which presentation of the first video frame in the VOB 
starts. 

[0449] The Final Video Frame Presentation Termina- 
tion Time register VOB_V_EPTM_reg stores the time at 
which presentation of the last video frame in the VOB 
ends. 

[0450] The audio reproduction stopping time 1 
VOB_A_STP_PTM1_reg stores the time at which the 
audio is to be paused to enable synchronization, and
the audio reproduction stopping period 1 
VOB_A_GAP_LEN1_reg stores the length of this pause 
period. 

[0451] The audio reproduction stopping time 2 
VOB_A_STP_PTM2_reg and audio reproduction stop- 
ping period 2 VOB_A_GAP_LEN2_reg store the same 
values. 

[0452] The operation of the DVD decoder DCD 
according to the present invention as shown in Fig. 26 is 
described next below with reference to the flow chart in 
Fig. 60. 

[0453] At step #310202 it is first determined whether 
a disk has been inserted. If it has, the procedure moves 
to step #310204. 

[0454] At step #310204, the volume file structure VFS 
(Fig. 21) is read, and the procedure moves to step 
#310206. 

[0455] At step #310206, the video manager VMG (Fig.
21) is read and the video title set VTS to be reproduced 
is extracted. The procedure then moves to step 
#310208. 

[0456] At step #310208, the video title set menu 
address information VTSM_C_ADT is extracted from 
the VTS information VTSI, and the procedure moves to 
step #310210. 

[0457] At step #310210 the video title set menu 
VTSM_VOBS is read from the disk based on the video 
title set menu address information VTSM_C_ADT, and 
the title selection menu is presented. 
[0458] The user is thus able to select the desired title 
from this menu in step #310212. If the titles include both
contiguous titles with no user-selectable content, and 
titles containing audio numbers, sub-picture numbers, 
or multi-angle scene content, the user must also enter 
the desired angle number. Once the user selection is 
completed, the procedure moves to step #310214. 
[0459] At step #310214, the VTS_PGCI #i program 
chain (PGC) data block corresponding to the title 
number selected by the user is extracted from the 
VTSPGC information table VTS_PGCIT, and the proce- 
dure moves to step #310216. 

[0460] Reproduction of the program chain PGC then 
begins at step #310216. When program chain PGC 
reproduction is finished, the decoding process ends. If a 
separate title is thereafter to be reproduced as determined
by monitoring key entry to the scenario selector,
the title menu is presented again (step #310210).
[0461] Program chain reproduction in step #310216 
above is described in further detail below referring to 
Fig. 61. The program chain PGC reproduction routine
consists of steps #31030, #31032, #31034, and #31035
as shown.

[0462] At step #31030 the decoding system table 
shown in Fig. 58 is defined. The angle number 
ANGLE_NO_reg, VTS number VTS_NO_reg, PGC 
number VTS_PGCI_NO_reg, audio ID AUDIO_ID_reg,
and sub-picture ID SP_ID_reg are set according to the 
selections made by the user using the scenario selector 
2100. 

[0463] Once the PGC to be reproduced is determined,
the corresponding cell information (PGC information
entries C_PBI #j) is extracted and the cell information
register is defined. The sub-registers therein that are
defined are the cell block mode CBM_reg, cell block
type CBT_reg, seamless reproduction flag SPF_reg,
interleaved allocation flag IAF_reg, STC resetting flag
STCDF, seamless angle change flag SACF_reg, first
cell VOBU start address C_FVOBU_SA_reg, and last
cell VOBU start address C_LVOBU_SA_reg.
[0464] Once the decoding system table is defined, the 
process transferring data to the stream buffer (step
#31032) and the process decoding the data in the 
stream buffer (step #31034) are activated in parallel. 
[0465] The process transferring data to the stream 
buffer (step #31032) is the process of transferring data 
from the recording medium M to the stream buffer 2400.
This is, therefore, the processing of reading the required 
data from the recording medium M and inputting the 
data to the stream buffer 2400 according to the user- 
selected title information and the playback control information
(navigation packs NV) written in the stream.
[0466] The routine shown as step #31034 is the proc- 
ess for decoding the data stored to the stream buffer 
2400 (Fig. 26), and outputting the decoded data to the 
video data output terminal 3600 and audio data output 
terminal 3700. Thus, this is the process for decoding and
reproducing the data stored to the stream buffer 2400. 
[0467] Note that step #31032 and step #31034 are 
executed in parallel. 

[0468] The processing unit of step #31032 is the cell,
and as processing one cell is completed, it is determined
in step #31035 whether the complete program
chain PGC has been processed. If processing the complete
program chain PGC is not completed, the decoding
system table is defined for the next cell in step
#31030. This loop from step #31030 through step
#31035 is repeated until the entire program chain PGC
is processed.
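
The parallel operation of the transfer process (step #31032) and the decoding process (step #31034) resembles a producer and a consumer sharing a bounded buffer. The following sketch illustrates that pattern only; the thread layout, the queue standing in for the stream buffer 2400, the end marker, and the placeholder "decoding" are all assumptions and not the decoder's actual design.

```python
import queue
import threading


def reproduce_pgc(cells, buffer_slots=2):
    """cells: list of byte strings standing in for cell data on the disk."""
    stream_buffer = queue.Queue(maxsize=buffer_slots)  # stands in for buffer 2400
    decoded = []

    def transfer():                      # step #31032, one cell per pass
        for cell in cells:               # steps #31030/#31035: next cell until done
            stream_buffer.put(cell)      # blocks while the buffer is full
        stream_buffer.put(None)          # hypothetical end-of-PGC marker

    def decode():                        # step #31034
        while True:
            data = stream_buffer.get()
            if data is None:
                break
            decoded.append(data.upper())  # placeholder for real decoding

    workers = [threading.Thread(target=transfer),
               threading.Thread(target=decode)]
    for worker in workers:
        worker.start()
    for worker in workers:
        worker.join()
    return decoded


print(reproduce_pgc([b"cell-1", b"cell-2", b"cell-3"]))
```

The bounded queue makes the transfer side wait when the buffer is full, which loosely mirrors reading from the disk only as fast as the decoder consumes data.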

[0469] The stream buffer data transfer process of step 
#31032 is described in further detail below referring to 
Fig. 62. The stream buffer data transfer process (step
#31032) comprises steps #31040, #31042, #31044, 
#31046, and #31048 shown in the figure. 
[0470] At step #31040 it is determined whether the
cell is a multi-angle cell. If not, the procedure moves to
step #31044.

[0471] At step #31044 the non-multi-angle cell decoding
process is executed.

[0472] However, if step #31040 returns YES because
the cell is a multi-angle cell, the procedure moves to
step #31042 where the seamless angle change flag
SACF is evaluated to determine whether seamless
angle reproduction is specified.

[0473] If seamless angle reproduction is specified, the
seamless multi-angle decoding process is executed in
step #31046. If seamless angle reproduction is not
specified, the non-seamless multi-angle decoding process
is executed in step #31048.
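The branching of the stream buffer data transfer process can be sketched as follows. This is only an illustration of the decision order described above, not the decoder's actual control code: the function and enum names are hypothetical, and the step numbers follow the list given for Fig. 62 (#31040 through #31048).

```python
from enum import Enum


class TransferProcess(Enum):
    """Hypothetical labels for the three transfer routines of Fig. 62."""
    NON_MULTI_ANGLE = "#31044"
    SEAMLESS_MULTI_ANGLE = "#31046"
    NON_SEAMLESS_MULTI_ANGLE = "#31048"


def select_transfer_process(is_multi_angle_cell: bool, sacf: bool) -> TransferProcess:
    # Step #31040: is the cell a multi-angle cell?
    if not is_multi_angle_cell:
        return TransferProcess.NON_MULTI_ANGLE
    # Step #31042: evaluate the seamless angle change flag SACF.
    if sacf:
        return TransferProcess.SEAMLESS_MULTI_ANGLE
    return TransferProcess.NON_SEAMLESS_MULTI_ANGLE
```

Note that SACF is consulted only for multi-angle cells; for any other cell its value has no effect on the selected routine.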
[0474] The non-multi-angle cell decoding process 
(step #31044, Fig. 62) is described further below with 
reference to Fig. 63. Note that the non-multi-angle cell 
decoding process (step #31044) comprises the steps 
#31050, #31052, and #31054. 

[0475] The first step #31050 evaluates the interleaved
allocation flag IAF_reg to determine whether the cell is
in an interleaved block. If it is, the non-multi-angle
interleaved block process is executed in step #31052.
[0476] The non-multi-angle interleaved block process 
(step #31052) processes scene branching and connec- 
tion where seamless connections are specified in, for 
example, a multi-scene period. 

[0477] However, if the cell is not in an interleaved 
block, the non-multi-angle contiguous block process is 
executed in step #31054. Note that the step #31054 
process is the process executed when there is no scene 
branching or connection. 

[0478] The non-multi-angle interleaved block process 
(step #31052, Fig. 63) is described further below with 
reference to Fig. 64. 

[0479] At step #31060 the reading head 2006 is 
jumped to the first cell VOBU start address 
C_FVOBU_SA read from the C_FVOBU_SA_reg regis- 
ter. 

[0480] More specifically, the address data
C_FVOBU_SA_reg stored in the decoding system controller
2300 (Fig. 26) is input as bitstream reproduction
control signal St53 to the reproduction controller 2002.
The reproduction controller 2002 thus controls the
recording media drive unit 2004 and signal processor
2008 to move the reading head 2006 to the specified
address. The data is then read, error correction code ECC
and other signal processing is accomplished by the signal
processor 2008, and the cell start VOBU data is output
as the reproduced bitstream St61 to the stream buffer
2400. The procedure then moves to step #31062.
[0481] At step #31062 the DSI packet data in the
navigation pack NV (Fig. 20) is extracted in the stream
buffer 2400, the decoding table is defined, and the
procedure moves to step #31064. The registers set in the
decoding table are the ILVU_EA_reg, NT_ILVU_SA_reg,
VOB_V_SPTM_reg, VOB_V_EPTM_reg, VOB_A_STP_PTM1_reg,
VOB_A_STP_PTM2_reg, VOB_A_GAP_LEN1_reg,
and VOB_A_GAP_LEN2_reg.

[0482] At step #31064 the data from the first cell
VOBU start address C_FVOBU_SA_reg to the ILVU
end pack address ILVU_EA_reg, i.e., the data for one
interleaved unit ILVU, is transferred to the stream buffer
2400. The procedure then moves to step #31066.
[0483] More specifically, the address data
ILVU_EA_reg stored in the decoding system controller
2300 (Fig. 26) is supplied to the reproduction controller
2002. The reproduction controller 2002 thus controls
the recording media drive unit 2004 and signal processor
2008 to read the data to the ILVU_EA_reg address,
and after error correction code ECC and other signal
processing is accomplished by the signal processor
2008, the data for the first ILVU in the cell is output as
the reproduced bitstream St61 to the stream buffer
2400. It is thus possible to output the data for one
contiguous interleaved unit ILVU on the recording medium
M to the stream buffer 2400.
[0484] At step #31066 it is determined whether all
interleaved units in the interleaved block have been
read and transferred. If the interleaved unit ILVU
processed is the last ILVU in the interleaved block,
"0x7FFFFFFF" indicating termination is set to the
next-ILVU start address NT_ILVU_SA_reg as the next read
address. If not all interleaved units in the interleaved
block have been processed, the procedure moves to
step #31068.

[0485] At step #31068 the reading head 2006 is again
jumped to the address NT_ILVU_SA_reg of the next
interleave unit to be reproduced, and the procedure
loops back to step #31062. Note that this jump is also
accomplished as described above, and the loop from
step #31062 to step #31068 is repeated.
[0486] However, if step #31066 returns YES, i.e., all
interleaved units ILVU in the interleaved block have been
transferred, step #31052 terminates.
[0487] The non-multi-angle interleaved block process
(step #31052) thus transfers the data of one cell to the
stream buffer 2400.
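The loop of steps #31060 through #31068 can be sketched as follows. This is a simplified model under stated assumptions, not the actual drive control: the disc is represented by a hypothetical dictionary keyed by ILVU start address, each entry standing in for the DSI values read from the navigation pack NV, and all addresses are treated as absolute sector numbers.

```python
END_OF_BLOCK = 0x7FFFFFFF  # termination value set to NT_ILVU_SA_reg for the last ILVU


def transfer_interleaved_cell(disc, c_fvobu_sa_reg):
    """Return the (start, end) address pairs of the ILVUs sent to the stream buffer."""
    transferred = []
    address = c_fvobu_sa_reg                 # step #31060: jump to the first VOBU of the cell
    while True:
        dsi = disc[address]                  # step #31062: extract DSI, define decoding table
        ilvu_ea_reg = dsi["ILVU_EA"]
        nt_ilvu_sa_reg = dsi["NT_ILVU_SA"]
        transferred.append((address, ilvu_ea_reg))   # step #31064: transfer one ILVU
        if nt_ilvu_sa_reg == END_OF_BLOCK:   # step #31066: last ILVU in the block?
            return transferred
        address = nt_ilvu_sa_reg             # step #31068: jump to the next ILVU


# Hypothetical three-ILVU cell whose units are separated by other streams' data.
example_disc = {
    100: {"ILVU_EA": 149, "NT_ILVU_SA": 300},
    300: {"ILVU_EA": 349, "NT_ILVU_SA": 500},
    500: {"ILVU_EA": 549, "NT_ILVU_SA": END_OF_BLOCK},
}
```

Calling `transfer_interleaved_cell(example_disc, 100)` visits the three units in order and skips the intervening addresses, which is the jump behaviour the interleaved block process relies on.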

[0488] The non-multi-angle contiguous block process
executed in step #31054, Fig. 63, is described further
below with reference to Fig. 65.

[0489] At step #31070 the reading head 2006 is
jumped to the first cell VOBU start address
C_FVOBU_SA read from the C_FVOBU_SA_reg register.
This jump is also accomplished as described above,
and the loop from step #31072 to step #31076 is
initiated.
[0490] At step #31072 the DSI packet data in the
navigation pack NV (Fig. 20) is extracted in the stream
buffer 2400, the decoding table is defined, and the
procedure moves to step #31074. The registers set in the
decoding table are the VOBU_EA_reg, VOB_V_SPTM_reg,
VOB_V_EPTM_reg, VOB_A_STP_PTM1_reg,
VOB_A_STP_PTM2_reg, VOB_A_GAP_LEN1_reg, and
VOB_A_GAP_LEN2_reg.

[0491] At step #31074 the data from the first cell 
VOBU start address C_FVOBU_SA_reg to the end 
pack address VOBU_EA_reg, i.e., the data for one 
video object unit VOBU, is transferred to the stream 
buffer 2400. The procedure then moves to step #31076. 
The data for one video object unit VOBU contiguously 
arrayed to the recording medium M can thus be trans- 
ferred to the stream buffer 2400. 
[0492] At step #31076 it is determined whether all cell
data has been transferred. If not all VOBU in the cell
have been transferred, the data for the next VOBU is read
continuously, and the process loops back to step
#31070.

[0493] However, if all VOBU data in the cell has been 
transferred as determined by the C_LVOBU_SA_reg 
value in step #31076, the non-multi-angle contiguous 
block process (step #31054) terminates. This process 
thus transfers the data of one cell to the stream buffer 
2400. 
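For comparison with the interleaved case, the contiguous block process can be sketched as a loop with no jumps between units. The model below is an assumption-laden illustration: the hypothetical dictionary maps each VOBU start address to its end pack address VOBU_EA, addresses are absolute, the next VOBU is assumed to begin immediately after the previous end pack, and termination is detected by comparing the current start address with C_LVOBU_SA_reg.

```python
def transfer_contiguous_cell(vobus, c_fvobu_sa_reg, c_lvobu_sa_reg):
    """Return the (start, end) address pairs of the VOBUs sent to the stream buffer."""
    transferred = []
    address = c_fvobu_sa_reg                    # step #31070: jump to the first VOBU
    while True:
        vobu_ea_reg = vobus[address]            # step #31072: DSI gives VOBU_EA
        transferred.append((address, vobu_ea_reg))  # step #31074: transfer one VOBU
        if address == c_lvobu_sa_reg:           # step #31076: was this the last VOBU?
            return transferred
        address = vobu_ea_reg + 1               # assumption: VOBUs are back to back


# Hypothetical cell of three contiguous VOBUs.
example_vobus = {0: 9, 10: 24, 25: 30}
```

Because the units are contiguous, the only head movement is the initial jump; every later read simply continues from where the previous one ended.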

Decoding flows in the stream buffer 

[0494] The process for decoding data in the stream 
buffer 2400 shown as step #31034 in Fig. 61 is 
described below referring to Fig. 66. This process (step 
#31034) comprises steps #31110, #31112, #31114, 
and #31116. 

[0495] At step #31110 data is transferred in pack units
from the stream buffer 2400 to the system decoder 2500
(Fig. 26). The procedure then moves to step #31112.
[0496] At step #31112 the pack data is transferred from
the stream buffer 2400 to each of the buffers, i.e., the
video buffer 2600, sub-picture buffer 2700, and audio
buffer 2800.

[0497] At step #31112 the IDs of the user-selected
audio and sub-picture data, i.e., the audio ID
AUDIO_ID_reg and the sub-picture ID SP_ID_reg
stored to the scenario information register shown in Fig.
58, are compared with the stream ID and sub-stream ID
read from the packet header (Fig. 19), and the matching
packets are output to the respective buffers. The
procedure then moves to step #31114.
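The ID comparison of step #31112 can be sketched as a simple filter. This is a simplified model, not the system decoder's actual logic: each pack is a hypothetical dictionary with a `kind` field and a single `id` field standing in for the stream ID and sub-stream ID of the packet header, and no real DVD ID values are implied.

```python
def route_packs(packs, audio_id_reg, sp_id_reg):
    """Distribute packs to the video, sub-picture, and audio buffers."""
    buffers = {"video": [], "sub_picture": [], "audio": []}
    for pack in packs:
        kind, stream = pack["kind"], pack["id"]
        if kind == "video":
            buffers["video"].append(pack)
        elif kind == "audio" and stream == audio_id_reg:
            buffers["audio"].append(pack)          # matches the user-selected audio
        elif kind == "sub_picture" and stream == sp_id_reg:
            buffers["sub_picture"].append(pack)    # matches the user-selected sub-picture
        # Packs of unselected audio or sub-picture streams are not buffered.
    return buffers


# Hypothetical stream with two audio tracks and two sub-picture tracks.
example_packs = [
    {"kind": "video", "id": 0},
    {"kind": "audio", "id": 0},
    {"kind": "audio", "id": 1},
    {"kind": "sub_picture", "id": 0},
    {"kind": "sub_picture", "id": 3},
]
```

With `audio_id_reg=1` and `sp_id_reg=3`, only the second audio pack and the second sub-picture pack reach their buffers, alongside the video pack.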
[0498] The decode timing of the respective decoders
(video, sub-picture, and audio decoders) is controlled in
step #31114, i.e., the decoding operations of the
decoders are synchronized, and the procedure moves to
step #31116.

[0499] Note that the decoder synchronization process
of step #31114 is described below with reference to Fig.
15.

[0500] The respective elementary streams are then
decoded at step #31116. The video decoder 3801 thus
reads and decodes the data from the video buffer, the
sub-picture decoder 3100 reads and decodes the data
from the sub-picture buffer, and the audio decoder 3200
reads and decodes the data from the audio buffer.
[0501] This stream buffer data decoding process then
terminates when these decoding processes are completed.

[0502] The decoder synchronization process of step
#31114, Fig. 66, is described below with reference to
Fig. 15. This process comprises steps #31120,
#31122, and #31124.

[0503] At step #31120 it is determined whether a
seamless connection is specified between the current
cell and the preceding cell. If a seamless connection is
specified, the procedure moves to step #31122; if not,
the procedure moves to step #31124.

[0504] A process synchronizing operation for producing
seamless connections is executed in step #31122,
and a process synchronizing operation for non-seamless
connections is executed in step #31124.

System encoder 

[0505] In the embodiment described below, plural 
buffers, including a stream buffer 2400, video buffer 
2600, audio buffer 2800, and reordering buffer 3300 as 
shown in Fig. 26, are used for the single time-share con- 
trolled buffer of the DVD decoder DCD in the present 
invention. 

[0506] Note that in the following description the actual 
buffer means made from semiconductor memory 
devices or similar physical means are referred to as 
"physical buffers," and the buffer means to which differ- 
ent data are stored by time-share controlled use of the 
physical buffers are referred to as "functional buffers." 
Note that sub-picture data decoding is completed 
instantaneously, and the load imposed on DVD decoder 
DCD operation can thus be ignored in comparison with 
load imposed by the encoded audio and video streams. 
The description of the present embodiment below is 
therefore limited to a single encoded video stream and 
a single encoded audio stream for simplicity. 
[0507] Shown in Fig. 39 are the simulated results of 
data input/output to the video buffer 2600 and audio 
buffer 2800 of the DVD decoder DCD, and the sequence 
in which the DVD encoder ECD multiplexes the 
encoded video stream St27 and the encoded audio 
stream St31 to generate the corresponding bitstream. 
Note that the progression of time is shown on the hori- 
zontal axis T. 

[0508] The frame G1 shown at the top row in Fig. 39 
shows the packetizing of the encoded video stream 
St27 by the DVD encoder ECD. Each block V in frame 
G1 indicates a video packet V. The vertical axis indi- 
cates the input transfer rate to the video buffer 2600, 
and the horizontal axis, time-base X indicates the trans- 
fer time. The area of each video packet represents the 
data size of the packet. The audio packets A are simi- 
larly shown with the area of the audio packet also indi- 
cating the packet size. Note, however, that while the 
audio packets appear to be larger than the video pack- 
ets V, i.e., contain more data, the audio packets and 
video packets are all the same size. 



[0509] Data input/output to the video buffer 2600 of 
the DVD decoder DCD is shown on the second row of 
Fig. 39. The vertical axis Vdv here indicates the accu- 
mulated video data volume Vdv in the video buffer 2600. 
[0510] More specifically, the first video packet V in the
encoded video stream St71 input to the video buffer
2600 is input at time Tb1. The last video packet V in the
encoded video stream St71 is input at time Tvf. Line SVi
thus indicates the change in the video data volume Vdv
accumulated in the video buffer 2600 at the front of the
encoded video stream St71, and line SVf indicates the
change in the video data volume Vdv accumulated in
the video buffer 2600 at the end of the encoded video
stream St71. Thus, the slopes of lines SVi and SVf
indicate the input rate to the video buffer 2600. Line BCv
indicates the maximum accumulation capacity (storage
capacity) of the video buffer 2600.
[0511] Note that lines BCv and BCa are determined
based on data written to the system stream header
according to the MPEG standard.

[0512] The accumulated video data volume Vdv in the
video buffer 2600 increases linearly, and at time Td1 the
first block d1 of video data is batch transferred in a
first-in first-out (FIFO) fashion to the video decoder 3801
whereby it is consumed for decoding. As a result, the
accumulated video data volume Vdv is reduced to (BCv
- d1), and then continues to accumulate. Note that while
this example shows the accumulated video data volume
Vdv at time Td1 to have reached the maximum storage
capacity BCv of the video buffer 2600, it is not necessary
for the accumulated video data volume Vdv to have
reached the maximum storage capacity BCv when
decoding begins, and it may obviously be less than the
maximum storage capacity BCv.
[0513] Part of the data d1 transferred to the video
buffer 2600, specifically the data at point B at the top
end of the dotted line having the same slope as line SVi
and intersecting the time-base at intersection tb, was
data input at time Tb. Thus, the data block d1 first
decoded is the data input between times Tb1 and Tb2.
Furthermore, when data input time Tb2 is later than
decoding time Td1, a data underflow state occurs in the
video buffer 2600 at time Td1.

[0514] The variation in the per-picture encoded data
quantity is great in an MPEG-compressed encoded
video stream, and temporary depletion of large
amounts of encoded data may occur. To prevent a data
underflow state from occurring in the video buffer in
such cases, it is necessary to write as much data as
possible to the video buffer 2600. The time required for
data transfer is thus called the video buffer verifier delay
vbv_delay.

[0515] The third row in Fig. 39 shows the audio data
packetizing process. As with the video data packets in
the first row, the frames A indicate the audio packets A,
the size of which is equal to the size of the video packets
V.

[0516] The fourth row simulates the results of data
input/output to the audio buffer 2800 similar to the
results of data input/output to the video buffer 2600 in
the second row. The vertical axis here indicates the
accumulated audio data volume Vda in the audio
stream buffer 2800.

[0517] Note that in Fig. 39 time Tvp1 is the video
presentation start time, Tap1 is the audio presentation
start time, Fv is the video frame reproduction time, and
Fa is the audio frame reproduction time.
[0518] At time Tad1, the first audio packet A in the 
encoded audio stream St75 is input to the audio buffer 
2800. Line SAi thus indicates the change in the audio 
data volume Vda accumulated in the audio buffer 2800 
at the front of the encoded audio stream St75, and line 
SAf indicates the change in the audio data volume Vda 
accumulated in the audio buffer 2800 at the end of the 
encoded audio stream St75. Thus, the slopes of lines 
SAi and SAf indicate the input rate to the audio buffer 
2800. Line BCa indicates the maximum accumulation 
capacity (storage capacity) of the audio buffer 2800. Note that the
maximum storage capacity BCa is obtained in the same 
manner as the maximum storage capacity BCv of the 
video buffer 2600. 

[0519] The audio access unit, i.e., the audio frame 
(which is also the audio compression unit), is generally 
constant in the audio stream. A data overflow state 
occurs in the audio buffer 2800 if the encoded audio 
stream St75 is input to the audio buffer 2800 in a short 
period at a rate exceeding the consumption rate, and 
the input volume thus exceeds the maximum storage 
capacity BCa of the audio buffer 2800. When this hap- 
pens, the next audio packet A cannot be input until 
audio data stored in the audio buffer 2800 is consumed, 
i.e., decoded. 

[0520] Furthermore, because the video packets V and
audio packets A are contiguous in a single bitstream,
the following video packet V cannot be input to the video
buffer 2600, even though the video buffer 2600 itself is
not in a data overflow state, if a data overflow state
occurs in the audio buffer 2800. Thus, a data overflow
state in the audio buffer 2800 may create a data
underflow state in the video buffer 2600 depending on the
duration of the data overflow state.
[0521] Therefore, to prevent an audio buffer overflow,
data input to the audio buffer 2800 is restricted when the 
sum of the data accumulated in the audio buffer and the 
data size of the packet exceeds the maximum audio 
buffer capacity. More specifically, the present embodi- 
ment transfers only the packet(s) containing the (frame) 
data required by the audio decode time, and does not 
permit inputting more than the required amount of data 
to the audio buffer. However, because of the difference 
in the data size of the packets (approx. 2 KB) and the 
audio frame (1536 bytes at 384 Kbps with Dolby AC-3 
coding), the data for the frame following the current 
frame is simultaneously input. 
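The admission rule and the size mismatch described in paragraph [0521] can be sketched as follows. This is a minimal illustration, not the encoder's actual implementation: the function names are hypothetical, the packet size is rounded to 2048 bytes for the quoted "approx. 2 KB", and the 4 KB audio buffer capacity is an assumption borrowed from the numerical example given later in the description.

```python
AUDIO_BIT_RATE_BPS = 384_000     # 384 Kbps AC-3, read here as 384,000 bits per second
AUDIO_FRAME_MS = 32              # reproduction time of one AC-3 frame
PACKET_BYTES = 2048              # assumption: "approx. 2 KB" rounded to 2048 bytes
AUDIO_BUFFER_BYTES = 4 * 1024    # assumption: capacity BCa taken as 4 KB


def audio_frame_bytes(bit_rate_bps, frame_ms):
    """Size of one audio frame in bytes for a constant bit rate."""
    return bit_rate_bps * frame_ms // (8 * 1000)


def may_input_audio_packet(accumulated_bytes, packet_bytes, capacity_bytes):
    """Input is restricted when accumulated data plus the packet exceeds capacity."""
    return accumulated_bytes + packet_bytes <= capacity_bytes


frame_bytes = audio_frame_bytes(AUDIO_BIT_RATE_BPS, AUDIO_FRAME_MS)
# Bytes of the following frame that arrive in the same packet as the current frame.
spill_bytes = PACKET_BYTES - frame_bytes
```

Under these assumptions one packet carries a full 1536-byte frame plus 512 bytes of the next frame, which is why slightly more than the strictly required data enters the audio buffer.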

[0522] Thus, as shown by the audio data packet
stream (row three, Fig. 39) and the audio buffer
input/output timing (row four, Fig. 39), only
approximately one audio frame of data is input to the audio
buffer 2800 before the audio decode time.
[0523] Because of the characteristics of an
MPEG-compressed video stream, decoding normally starts at
video frame reproduction time Fv before the first video
presentation start time Tvp1, and the audio data is input
to the audio buffer 2800 at audio frame reproduction
time Fa before decoding starts, i.e., before audio
presentation start time Tap1. The video stream is thus input
to the video buffer 2600 approximately (video buffer
verifier delay vbv_delay + video frame reproduction time Fv
- audio frame reproduction time Fa) before audio stream
input begins.

[0524] The fifth row in Fig. 39 shows the results of
interleaving the video packet stream G1 (row 1) with the
audio packet stream G2 (row 3). The audio and video
packets are interleaved by multiplexing referenced to
the respective input times to the audio and video
buffers.

[0525] For example, Tb1 is the index for the buffer
input time of the first pack in the encoded video stream,
and Ta1 is the index for the buffer input time of the first
pack in the encoded audio stream. The packed data is
then multiplexed referenced to the buffer input time of
the data in the packs to the audio and video buffers.
Because the encoded video stream is thus input to the
video buffer 2600 at approximately the vbv_delay plus
one video frame minus one audio frame, plural video
frames are contiguous at the front of the system stream.
There is a similar series of audio packets at the end of
the system stream equivalent to approximately the lead
time at which the video stream is buffered before the
encoded audio stream.

[0526] Note again that a data overflow state occurs in
the audio buffer 2800 if the encoded audio stream St75
is input to the audio buffer 2800 in a short period at a
rate exceeding the consumption rate, and the input
volume thus exceeds the maximum storage capacity BCa
of the audio buffer 2800. When this happens, the next
audio packet A cannot be input until audio data stored in
the audio buffer 2800 is consumed, i.e., decoded. Gaps
therefore occur at the end of the system stream when
only the audio packets are being transferred.

[0527] For example, if the video bit rate is 8 Mbps, the
video buffer capacity is 224 KB, and 224 KB of video
data are buffered before video decoding starts in the
DVD system, the video buffer verifier delay vbv_delay
will be approximately 219 msec. If NTSC video and AC-3
audio coding are used, one NTSC video frame is
approximately 33 msec, and one AC-3 audio frame is
approximately 32 msec. At the head of the system
stream in this example the video stream leads the audio
stream by approximately 220 msec (=219 msec + 33
msec - 32 msec), and video packets are arrayed
contiguously for this period.
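The arithmetic of this example can be checked with a short calculation. The sketch below assumes binary units (1 KB = 1024 bytes and 1 Mbps = 1024 x 1024 bits per second), an assumption chosen because it reproduces the quoted 219 msec; other unit conventions give a somewhat different figure. The function name is hypothetical.

```python
KB = 1024               # assumption: binary kilobyte
MBPS = 1024 * 1024      # assumption: binary megabit per second


def vbv_delay_ms(buffered_bytes, video_bit_rate_bps):
    """Time needed to fill the video buffer at a constant input rate, in msec."""
    return buffered_bytes * 8 / video_bit_rate_bps * 1000


VIDEO_FRAME_MS = 33     # approximate NTSC frame period used in the text
AUDIO_FRAME_MS = 32     # AC-3 frame period used in the text

delay_ms = vbv_delay_ms(224 * KB, 8 * MBPS)
# Lead of the video stream over the audio stream at the head of the system stream.
video_lead_ms = round(delay_ms) + VIDEO_FRAME_MS - AUDIO_FRAME_MS
```

Here `delay_ms` evaluates to 218.75, which rounds to the 219 msec stated above, and `video_lead_ms` evaluates to 220.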

[0528] The audio packets continue in a similar series
at the end of the system stream for the lead time of the
encoded video stream to the encoded audio stream.
[0529] By thus producing and recording the system 
streams, audio and video reproduction can be accom- 
plished without creating a data underflow state in the 
video buffer of the DVD decoder shown in Fig. 26. 
[0530] Movies and other titles can be recorded to an 
optical disk by a DVD system using this type of MPEG 
system stream. However, if plural titles implementing 
parental lock control, director's cut selections, and other 
features are recorded to a single optical disk, it may be 
necessary to record ten or more titles to the disk. This
may require the bit rate to be dropped, with an attendant
loss of image quality.

[0531] However, by sharing the system streams
common to plural titles, e.g., titles implementing parental
lock control, director's cut selections, and other fea- 
tures, and discretely recording for each of the plural 
titles only those scenes that are unique to those titles, it 
is possible to record plural different titles to a single opti- 
cal disk without reducing the bit rate, and thereby with- 
out loss of image quality. This method thus makes it 
possible, for example, to record plural titles for different 
countries, cultures, or language groups to a single opti- 
cal disk without reducing the bit rate and therefore with- 
out loss of image quality. 

[0532] An example of a title stream providing for
parental lock control is shown in Fig. 40. When so-called
"adult scenes" containing sex, violence, or other
scenes deemed unsuitable for children are contained in
a title implementing parental lock control, the title
stream is recorded with a combination of common system
streams SSa, SSb, and SSe, an adult-oriented system
stream SSc containing the adult scenes, and a
child-oriented system stream SSd containing only the
scenes suitable for children. Title streams such as this
are recorded as a multi-scene system stream containing
the adult-oriented system stream SSc and the
child-oriented system stream SSd arrayed to the multi-scene
period between common system streams SSb and SSe.
[0533] The relationship between each of the compo- 
nent titles and the system stream recorded to the pro- 
gram chain PGC of a title stream thus comprised is 
described below. 

[0534] The adult-oriented title program chain PGC1 
comprises in sequence the common system streams 
SSa and SSb, the adult-oriented system stream SSc, 
and the common system stream SSe. The child-ori- 
ented title program chain PGC2 comprises in sequence 
the common system streams SSa and SSb, the child-
oriented system stream SSd, and the common system
stream SSe. 
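The sharing of system streams between the two program chains can be illustrated with a small data model. This is only a sketch: the program chains are represented as plain lists of stream names taken from the text, and the helper names are hypothetical.

```python
PGC1 = ["SSa", "SSb", "SSc", "SSe"]   # adult-oriented title
PGC2 = ["SSa", "SSb", "SSd", "SSe"]   # child-oriented title


def shared_streams(*chains):
    """System streams that appear in every program chain, in playback order."""
    common = set(chains[0]).intersection(*chains[1:])
    return [stream for stream in chains[0] if stream in common]


def streams_to_record(*chains):
    """Each distinct system stream once, in order of first appearance."""
    recorded = []
    for chain in chains:
        for stream in chain:
            if stream not in recorded:
                recorded.append(stream)
    return recorded
```

For these two chains only five distinct system streams need to be recorded, rather than the eight that two fully separate titles would require, which is the saving that lets the bit rate be maintained.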

[0535] To share system streams within titles comprising
multi-scene periods such as this, and to divide the
system stream as needed for authoring, it is also
necessary to be able to connect and contiguously reproduce
these system streams. When system streams are connected
and contiguously reproduced, however, pauses
in the video presentation (freezes) occur at the system
stream connections, and seamless reproduction presenting
a natural flow of a single title can be difficult to
achieve.

[0536] Data input/output to the video buffer 2600 of
the DVD decoder DCD shown in Fig. 26 during contiguous
reproduction is shown in Fig. 41. In Fig. 41, block
Ga shows the data input/output to the video buffer 2600
when encoded video stream Sva and encoded video
stream Svb are input to the DVD decoder DCD. Block
Gb shows the video packet streams of encoded video
stream Sva and encoded video stream Svb. Block Gc
shows the interleaved system streams Sra and Srb.
Note that blocks Ga, Gb, and Gc are arranged referenced
to the same time-base T as that shown in Fig. 39.

[0537] In block Ga the vertical axis shows the accumulated
video data volume Vdv in the video buffer, and
slope Sva indicates the input rate to the video buffer
2600. A decrease in the video data volume Vdv accumulated
in the video buffer 2600 in block Ga therefore
indicates data consumption, i.e., that data has
been output for decoding.

[0538] Time T1 also indicates the input end time of the
last video packet V1 in the system stream Sra (block
Gc), time T3 indicates the input end time of the last
audio packet A1 in system stream Sra, and time Td
indicates the first decode time of encoded video stream Svb
(block Ga).

[0539] Of the two streams constituting system stream
Sra, the encoded video stream Sva and the encoded
audio stream Saa, the encoded video stream
Sva is input to the video buffer 2600 before the encoded
audio stream Saa is input to the audio buffer 2800 as
described above. A series of audio packets A therefore
remains at the end of the system stream Sra.

[0540] A data overflow state also occurs in the audio
buffer 2800 if audio packets A exceeding the capacity of
the audio buffer 2800 are input thereto. When this
occurs, the next audio packet cannot be buffered until
an equivalent amount of audio data is consumed, i.e.,
decoded.

[0541] The first video packet V2 in system stream Srb
therefore cannot be input to the video buffer 2600 until
input of the last audio packet A1 in the system stream
Sra is completed. As a result, video stream input to the
video buffer 2600 cannot be continued due to the
interference from audio packet A1 during the period from T1,
the input end time of the last video packet V1 in system
stream Sra, to T3, the input end time of the last audio
packet A1 in system stream Sra.

[0542] In the following example it is assumed that the
video bit rate of the DVD system is 8 Mbps, the video
buffer capacity is 224 KB, the audio buffer capacity is 4
KB, the audio data is encoded with Dolby AC-3
compression, and the compression bit rate is 384 Kbps. In
AC-3 audio compression, the reproduction time of one
audio frame is 32 msec, corresponding to a data size of
1536 bytes/frame, and two audio frames can therefore
be stored in the audio buffer.
[0543] Because the number of audio frames that can
be stored in the audio buffer is two, the earliest possible
time T3, which is the input end time of the last audio
packet A1 in system stream Sra, is at the (reproduction
start time of the last audio frame in system stream Sra)
- (reproduction time of two audio frames). The
reproduction start time of the last audio frame in system stream
Sra is also approximately one audio frame earlier than
the presentation start time of the first frame in the
encoded video stream Svb of system stream Srb. The
presentation start time of encoded video stream Svb is
at the video buffer verifier delay vbv_delay plus one
video frame after the input end time T1 of the last video
packet V1 in system stream Sra.
[0544] Therefore, if 224 KB of video data is buffered 
by the time video decoding starts, the video buffer veri- 
fier delay vbv_delay is approximately 219 msec. If 
NTSC video and AC-3 audio coding are used, one 
NTSC video frame is approximately 33 msec, and one 
AC-3 audio frame is approximately 32 msec. Thus, 
there is approximately 156 msec (=219 msec + 33 
msec - 32 msec - 2 x 32 msec) from the input end time 
T1 of the last video packet V1 in system stream Sra to 
the input end time T3 of the last audio packet A1 in sys- 
tem stream Sra. The encoded video stream Svb cannot 
be input to the video buffer 2600 during this approxi- 
mately 156 msec period. 
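The 156 msec figure follows directly from the quantities stated in this example, as the short calculation below shows. The variable names are hypothetical, 4 KB is read as 4096 bytes (an assumption), and the rounded values of 219, 33, and 32 msec are taken as given in the text.

```python
AUDIO_BUFFER_BYTES = 4 * 1024   # assumption: 4 KB read as 4096 bytes
AC3_FRAME_BYTES = 1536          # one AC-3 frame at 384 Kbps
VBV_DELAY_MS = 219              # video buffer verifier delay from the text
VIDEO_FRAME_MS = 33             # approximate NTSC frame period
AUDIO_FRAME_MS = 32             # AC-3 frame period

# Whole audio frames that fit in the audio buffer.
frames_in_audio_buffer = AUDIO_BUFFER_BYTES // AC3_FRAME_BYTES

# Interval from T1 (end of last video packet V1) to the earliest possible T3
# (end of last audio packet A1), during which Svb cannot enter the video buffer.
blocked_ms = (
    VBV_DELAY_MS
    + VIDEO_FRAME_MS
    - AUDIO_FRAME_MS
    - frames_in_audio_buffer * AUDIO_FRAME_MS
)
```

A larger audio buffer would hold more frames and shorten this interval, but with only two frames of audio buffering most of the vbv_delay remains unavailable for loading the next video stream.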

[0545] Therefore, because not all of the decode data d1
has been input to the video buffer 2600 at time Td, a data
underflow state occurs in the video buffer 2600. In such
cases the video presentation becomes intermittent, video
freezing occurs, and correct video presentation is
interrupted.
[0546] Thus, when plural system streams are con- 
nected and contiguously decoded to reproduce a single 
contiguous sequence of scenes from plural titles com- 
prising a commonly shared system stream and plural 
system streams containing content discretely encoded 
for specific titles, video freezing apparent as pauses in 
the video presentation at system stream connections 
can occur, and it is not always possible to seamlessly 
reproduce such plural system streams as a single con- 
tiguous title. 

[0547] When plural different system streams SSc and 
SSd are connected to one common system stream SSe 
as shown in Fig. 40, a time difference occurs between 
the video reproduction time and the audio reproduction 
time because of the offset between the audio and video 
frame reproduction times, and this time difference var- 
ies according to the reproduction path. As a result 
buffer control fails at the connection, video reproduction 
freezes or the audio reproduction is muted, and seam- 
less reproduction is not possible. 
[0548] This problem is considered below with reference
to Fig. 42 as it applies to the parental lock control
example shown in Fig. 40. In Fig. 42 SScv and SSca
represent the reproduction times of the video and audio
frame unit streams in adult-oriented system stream
SSc. SSdv and SSda similarly represent the reproduction
times of the video and audio frame unit streams in
the child-oriented system stream SSd.
[0549] As described above, if NTSC video and AC-3
audio coding are used, one NTSC video frame is
approximately 33 msec, and one AC-3 audio frame is
approximately 32 msec, and the audio and video
reproduction times therefore do not match. As a result, a
difference occurs between the video reproduction time,
which is an integer multiple of the video frame
reproduction time, and the audio reproduction time, which
is an integer multiple of the audio frame reproduction
time. This reproduction time difference is expressed as
Tc in the adult-oriented system stream SSc, and time Td
in the child-oriented system stream SSd. This difference
also varies according to the change in the reproduction
time of the reproduction paths, and Tc ≠ Td.
[0550] Therefore, when plural system streams are
connected with a single system stream as described
above with parental lock control and director's cut titles,
there is a maximum reproduction gap of one frame in
the audio and video reproduction times at the points
where the system streams branch and connect.
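How the gap arises and why it differs between paths can be illustrated numerically. The sketch below rests on assumptions not stated in the text: the NTSC frame period is taken as exactly 1001/30 msec (which the text rounds to 33 msec), the AC-3 frame period as 32 msec, and the audio of a system stream is assumed to run to the first audio frame boundary at or after the end of its video. The function name is hypothetical.

```python
from fractions import Fraction
import math

VIDEO_FRAME_MS = Fraction(1001, 30)   # assumption: exact NTSC frame period
AUDIO_FRAME_MS = Fraction(32)         # AC-3 frame period


def av_gap_ms(video_frames):
    """Difference between audio and video reproduction times for one stream."""
    video_ms = video_frames * VIDEO_FRAME_MS
    # Assumption: audio ends on the first audio frame boundary covering the video.
    audio_frames = math.ceil(video_ms / AUDIO_FRAME_MS)
    return audio_frames * AUDIO_FRAME_MS - video_ms


gap_short = av_gap_ms(3)    # a 3-frame stream
gap_long = av_gap_ms(30)    # a 30-frame stream
```

Under these assumptions the two streams end with different gaps (27.9 msec and 23 msec), and in every case the gap stays below one audio frame, consistent with the one-frame maximum stated above.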
[0551] This reproduction gap is described next with
reference to Fig. 43. The top program chain PGC1
represents the reproduction path of the adult-oriented
system stream. SScv and SSev represent the reproduction
times of the video frame unit streams in adult-oriented
system stream SSc and common system stream SSe,
and SSca and SSea represent the reproduction times of
the audio frame unit streams in adult-oriented system
stream SSc and common system stream SSe.
[0552] These frame unit reproduction times are 
expressed in the figure by the line segments ended with 
arrows on both ends. 

[0553] The video stream SScv of the adult-oriented
system stream SSc in this example ends after 3 frames,
and is followed by the common system stream SSe
starting at frame 4 with the first frame of the video
stream SSev. The audio stream SSca likewise ends at
frame 4, and the first frame of the common audio stream
SSea starts from frame 5. The resulting difference in the
frame reproduction times between the audio and video
streams produces a reproduction gap of time Tc
equivalent to a maximum one frame between the video
stream and the audio stream when these two streams
SSc and SSe are connected.

[0554] The bottom program chain PGC2 similarly rep-
resents the reproduction path of the child-oriented sys-
tem stream. SSdv and SSev represent the reproduction
times of the video frame unit streams in child-oriented
system stream SSd and common system stream SSe, 
and SSda and SSea represent the reproduction times of 
the audio frame unit streams in child-oriented system 
stream SSd and common system stream SSe. 

[0555] As with the adult-oriented program chain PGC1
above, a reproduction gap of time Td, equivalent to at
most one frame, occurs between the video stream and the
audio stream when these two streams SSd and
SSe are connected. When the reproduction paths to the
common system streams differ before the connection
point as shown in Fig. 43, it is possible to adjust the
reproduction start times of the connected common
audio and video streams to the reproduction start time
difference of at least one reproduction path. As shown
in this figure, the audio and video end times of the adult-
oriented system stream SSc are the same as the audio
and video start times of the common system stream
SSe, i.e., a gap-less connection is achieved. Note that
in this example the gap Td of the child-oriented system
stream SSd is less than the gap Tc of the adult-oriented
system stream SSc (Td < Tc).

[0556] One program chain, PGC1, i.e., adult-ori-
ented system stream SSc and common system stream
SSe, is thus reproduced without a reproduction gap, but
program chain PGC2, i.e., child-oriented system stream
SSd and common system stream SSe, is reproduced
with an audio reproduction gap of Tc - Td. Thus, even
when connecting from plural reproduction paths (SSc
and SSd) to one system stream (SSe), it is possible to
eliminate any reproduction gap in the video or audio on
at least one reproduction path.
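
The arithmetic of this paragraph can be sketched as follows. This is an illustration under stated assumptions only: the function name residual_gaps and the millisecond values are hypothetical and do not appear in this description. The common stream is encoded to match the audio/video time difference of one chosen path, and every other path is left with the difference between its own value and the matched value.

```python
def residual_gaps(path_differences_ms, matched_path):
    """Gap left on each path when the common stream matches one path.

    path_differences_ms maps a program chain name to its audio/video
    reproduction time difference (Tc, Td, ...) in milliseconds.
    """
    matched = path_differences_ms[matched_path]
    return {
        name: abs(difference - matched)
        for name, difference in path_differences_ms.items()
    }


if __name__ == "__main__":
    # Hypothetical values with Td < Tc, as in the example of Fig. 43.
    differences = {"PGC1": 20, "PGC2": 8}  # Tc = 20 ms, Td = 8 ms
    print(residual_gaps(differences, "PGC1"))
```

With the values above, PGC1 connects with no gap and PGC2 is left with a gap of Tc - Td = 12 msec.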

[0557] The third row in Fig. 43 shows the change in
audio buffer storage during continuous reproduction of
program chain PGC2, i.e., child-oriented system stream
SSd and common system stream SSe. The per-frame
reproduction time of the audio stream in the audio buffer
is indicated by the arrows. Note that system streams
SSd and SSe are connected with an audio reproduction
gap of Tc - Td, the difference between the reproduction
time difference Tc of PGC1 and the reproduction time
difference Td of PGC2, at the connection.
[0558] However, because DVD players normally syn-
chronize audio and video output referenced to the audio
signal, the audio frames are output continuously. As a
result, the audio reproduction gap Tc - Td is not repro- 
duced as a gap during reproduction, and audio repro- 
duction is therefore contiguous. 

[0559] The common system stream SSe is encoded
so that the audio is reproduced, i.e., decoded, at a delay
of Tc to the video. As a result, when the audio is repro-
duced, i.e., decoded, so that there is no audio reproduc-
tion gap Tc - Td, audio decoding is accomplished before
the audio data is completely input to the audio buffer,
and a data underflow state results in the audio buffer as
shown by line Lu in Fig. 43.

[0560] When the audio reproduction is contiguous and
a reproduction gap is inserted between video frames, a
data underflow state occurs in the video buffer due to
video stream reproduction, similarly to video stream
reproduction being interrupted as shown in Fig. 41.
[0561] As thus described, when plural different sys-
tem streams and one common system stream are con-
nected, a difference occurs between the video
reproduction time and the audio reproduction time of the
respective paths due to the offset in the audio and video
frame reproduction times.



[0562] The present invention therefore provides a
recording method and apparatus and a reproduction
method and apparatus whereby a video or audio buffer
underflow state is prevented at the system stream con-
nections, and seamless reproduction is achieved in which
pauses in the video reproduction (freezes) and pauses in
the audio reproduction (muting) do not occur.
[0563] A method of connecting a single common sys-
tem stream to the plural system streams contained in
the multi-scene period of a title stream as shown in Fig.
40 is described below according to the present inven-
tion. The physical structure of the optical disk M, the
overall data structure of the optical disk M, and the
structures of the DVD encoder ECD and DVD decoder
DCD in this embodiment are as previously described
with reference to Figs. 4-14, Figs. 1, 16-20, and 22, Figs.
25 and 27-29, and Fig. 26 above, and further description
thereof is thus omitted below.

[0564] There are two data transfer models under the 
MPEG standard: constant bit rate (CBR) whereby data 
is transferred continuously without interruptions, and 
variable bit rate (VBR) whereby data is transferred inter- 
mittently with interruptions in the transfer. For simplicity, 
the present embodiment is described below using the 
CBR model only. 

[0565] Referring first to Figs. 44, 45, and 46, a simple
one-to-one system stream connection between first and
second common system streams SSa and SSb is
described. For simplicity the following description is
restricted to operation using one video stream SSav 
and one audio stream SSba. 

[0566] The system streams produced according to the 
present invention are shown in Fig. 44, the operation 
whereby these system streams are connected is shown 
in Fig. 45, and the method of generating the system 
streams is shown in Fig. 46. 

[0567] The structure of the tail of the leading common 
system stream SSa, and the head of the following com- 
mon system stream SSb, recorded to the optical disk M 
are shown in Fig. 44. 

[0568] In Fig. 44 are shown the structure of the end of 
the preceding common system stream SSa and the 
common system stream SSb following thereafter. Note 
that both system streams SSa and SSb are recorded to 
the optical disk M. 

[0569] The fifth row block Ge shows the structure of 
both system streams SSa and SSb. The first common 
system stream SSa comprises video stream SSav and 
audio stream SSaa; the second common system 
stream SSb similarly comprises video stream SSbv and 
audio stream SSba. 

[0570] The fourth row Gd shows the audio packet 
streams A of audio stream SSaa and audio stream 
SSba extracted from system stream SSa and system 
stream SSb. 

[0571] The third row Gc shows the data input/output 
state of the audio buffer 2800 when audio stream SSaa 
and audio stream SSba are input to the DVD decoder
DCD shown in Fig. 26.

[0572] The second row Gb shows the video packet 
streams V of video stream SSav and video stream SSbv 
extracted from system stream SSa and system stream 
SSb.
[0573] The first row Ga shows the data input/output 
state of the video buffer 2600 when video stream SSav 
and video stream SSbv are input to the DVD decoder 
DCD shown in Fig. 26. 

[0574] Note that Ga, Gb, Gc, Gd, and Ge are all refer- 
enced to the same time-base (direction T). 
[0575] Tvae in Fig. 44 is the input end time of the 
video stream SSav to the video buffer 2600, and Taae is 
the input end time of the audio stream SSaa to the audio 
buffer 2800. 

[0576] When system stream SSa is input to the DVD 
decoder DCD, the difference between the input end 
times Tvae and Taae of the video stream SSav and 
audio stream SSaa to the respective buffers 2600 and 
2800 is small, and is less than the reproduction time of 
two audio frames. As a result, the last audio packet A 
can be accumulated in the audio buffer 2800 before 
input of the audio and video streams in the next system 
stream starts. 

[0577] Likewise, when system stream SSb is input to
the DVD decoder DCD, the difference between the input
start times of the video stream SSbv and audio stream
SSba to the respective buffers 2600 and 2800 is small,
and is less than the reproduction time of two audio
frames.
[0578] The data input/output state of the video buffer
2600 when system streams SSa and SSb (Fig. 44)
stored to the optical disk M are connected and contigu-
ously reproduced is described next with reference to
Fig. 45.

[0579] The top row in Fig. 45 shows the data input/out-
put state of the video buffer 2600 when video stream
SSav and video stream SSbv are input continuously to 
the DVD decoder DCD. 

[0580] As in Fig. 39, Fig. 41, and Fig. 44, the vertical
axis indicates the accumulated video data volume
Vdv in the video buffer 2600, and the horizontal axis
indicates time T. Lines SSav and SSbv indicate the
change in the video data volume Vdv accumulated in
the video buffer 2600, and the slopes of the lines indi-
cate the input rate to the video buffer 2600. A drop in the
accumulated video data volume Vdv in the video buffer
2600 indicates data consumption, i.e., that
decoding has occurred.
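
The behaviour plotted in such a buffer diagram can be reproduced with a very small simulation. The following is a sketch under stated assumptions, not the simulation performed by the encoder of this embodiment: data is assumed to arrive at a constant rate and to stop arriving while the buffer is full, each decode is assumed to remove one whole picture instantaneously, and the names simulate_buffer and BufferUnderflow as well as the picture sizes are hypothetical. Times are in milliseconds and the rate in bytes per millisecond, so 8 Mbps corresponds to 1000 bytes/msec.

```python
class BufferUnderflow(Exception):
    """Raised when a picture is due for decoding before it has fully arrived."""


def simulate_buffer(rate, capacity, decode_events, start_time=0):
    """Return [(decode_time, occupancy_after_decode), ...].

    rate          -- input rate in bytes per millisecond (the slope of the line)
    capacity      -- buffer size in bytes
    decode_events -- [(decode_time_ms, picture_bytes), ...] in time order
    """
    occupancy = 0
    now = start_time
    trace = []
    for decode_time, size in decode_events:
        # Occupancy rises along the slope until the buffer is full.
        occupancy = min(capacity, occupancy + rate * (decode_time - now))
        if occupancy < size:
            raise BufferUnderflow(f"underflow at t={decode_time} ms")
        # Decoding consumes the picture, seen as a vertical drop in the plot.
        occupancy -= size
        now = decode_time
        trace.append((decode_time, occupancy))
    return trace


if __name__ == "__main__":
    video_buffer = 224 * 1024  # bytes
    print(simulate_buffer(1000, video_buffer, [(100, 50000), (133, 30000)]))
```

Scheduling a decode before enough data has arrived, for example a 50000-byte picture only 10 msec after input starts, raises BufferUnderflow, which corresponds to the freeze condition that the present embodiment is intended to avoid.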

[0581] The second row shows the video packet
streams in the video streams SSav and SSbv shown in
Fig. 26.

[0582] The third row shows the system streams SSa
and SSb according to the present embodiment. Time T1
is the input end time of the last video packet V1 in sys-
tem stream SSa, time T2 indicates the input start time of
the first video packet V2 in system stream SSb, and time Td indi-
cates the decoding start time of system stream SSb.
[0583] The difference between the input end times to
the video buffer 2600 and audio buffer 2800 of the video
stream SSav and the audio stream SSaa forming the
system stream SSa of the present embodiment is
reduced by the system stream production method
shown in Fig. 46. As a result, interference with the input
of system stream SSb resulting from a succession of
remaining audio packets A at the end of system stream
SSa does not occur. Because the difference between the input
end time T1 of the last video packet V1 of system
stream SSa and the input start time T2 of the first video
packet V2 in system stream SSb is small, there is suffi-
cient time from the input start time T2 of video packet V2
to the first decode time Td of the video stream SSbv,
and the video buffer 2600 therefore does not underflow
at time Td.

[0584] Unlike with the system stream shown in Fig.
41, the audio buffer 2800 therefore does not overflow at
the end of the system stream, i.e., there is no interfer-
ence with inputting the encoded video stream of the
next system stream, when connecting and contiguously 
reproducing system streams SSa and SSb according to 
the present embodiment, and seamless reproduction 
can be achieved. 

[0585] A first method of producing a first common sys- 
tem stream SSa and a second common system stream 
SSb connected thereafter is described below with refer- 
ence to Fig. 46. Note that as in Fig. 44, the structure of 
the tail of the leading common system stream SSa, and 
the head of the following common system stream SSb, 
recorded to the optical disk M are shown in Fig. 46. 
[0586] The first row in Fig. 46 corresponds to block Ga 
in Fig. 44, and simulates the data input/output of video 
stream SSav and video stream SSbv to the video buffer 
2600. Time T1 is the input end time of all data in the 
video stream SSav. 

[0587] The second row similarly corresponds to block 
Gb in Fig. 44, and shows the video data packet stream. 
[0588] The third row similarly corresponds to block Gc 
in Fig. 44, and simulates the data input/output of audio 
stream SSaa and audio stream SSba to the audio buffer 
2800. 

[0589] The fourth row similarly corresponds to block 
Gd in Fig. 44, and shows the audio data packet stream. 
[0590] The fifth row similarly corresponds to block Ge 
in Fig. 44, and shows the system stream resulting from 
interleaving and packing the video packets V shown in 
the second row and the audio packets A shown in the 
fourth row. The video packets and audio packets are 
interleaved in a FIFO manner from the video and audio 
buffers referenced to the packet input time to the 
respective buffer. In other words, the packed data is 
multiplexed referenced to the time the data contained in 
the pack is input to the video or audio buffer. 
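
The FIFO interleaving rule of the fifth row can be sketched as a merge of two time-ordered packet lists. This is a minimal illustration only: the function name interleave, the packet labels, and the input times are hypothetical, and each packet is reduced to a pair of (buffer input time, label).

```python
import heapq


def interleave(video_packets, audio_packets):
    """Merge two packet lists, each sorted by input time, into one stream order.

    Each packet is a tuple (input_time, label). On equal input times the
    video packet is emitted first because heapq.merge is stable across
    its input iterables.
    """
    merged = heapq.merge(video_packets, audio_packets, key=lambda p: p[0])
    return [label for _, label in merged]


if __name__ == "__main__":
    video = [(0, "V1"), (2, "V2"), (4, "V3")]
    audio = [(1, "A1"), (3, "A2")]
    print(interleave(video, audio))
```

Because every packet is placed according to the time at which its data is input to the respective buffer, the resulting multiplex order is simply the order in which the decoder buffers would receive the data.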
[0591] The method of generating the first common 
system stream and the following second common sys- 
tem stream is described next. 

[0592] It is assumed below that the video bit rate is 8 
Mbps, the video buffer capacity is 224 KB, the audio
buffer capacity is 4 KB, the audio data is encoded with
Dolby AC-3 compression, and the compression bit rate 
is 384 Kbps. In AC-3 audio compression, the reproduc- 
tion time of one audio frame is 32 msec, corresponding 
to a data size of 1536 bytes/frame, and two audio 
frames can therefore be stored in the audio buffer. 
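
These figures can be checked with a few lines of arithmetic. The following sketch uses only the values assumed above; the constant names are hypothetical, and 1 KB is taken to be 1024 bytes, which is an assumption (taking 1 KB as 1000 bytes gives the same frame count).

```python
AUDIO_BITRATE_BPS = 384_000      # AC-3 compression bit rate
FRAME_DURATION_MS = 32           # reproduction time of one audio frame
AUDIO_BUFFER_BYTES = 4 * 1024    # audio buffer capacity (1 KB = 1024 bytes assumed)

# Bits in one frame = bit rate x frame duration; divide by 8 for bytes.
FRAME_BYTES = AUDIO_BITRATE_BPS * FRAME_DURATION_MS // 1000 // 8

# Only whole frames count, because the audio frame is the unit of reproduction.
FRAMES_IN_BUFFER = AUDIO_BUFFER_BYTES // FRAME_BYTES

if __name__ == "__main__":
    print(FRAME_BYTES, "bytes per frame;", FRAMES_IN_BUFFER, "frames fit in the buffer")
```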
[0593] Referenced to the input end time T1 of the 
video stream SSav to the video buffer 2600, the audio 
frame data following the current audio frame is moved to 
the audio stream SSba at time T1 to accumulate one 
audio frame in the audio buffer. This operation is 
described in detail below referring to the simulation 
results shown in row 3 of Fig. 46. 
[0594] Specifically, two audio frames (1536 bytes each)
from the encoded audio stream SSaa are accumulated
in the audio buffer (4 KB capacity) at time T1, and the 
third to sixth audio frames following thereafter, indicated 
by frame Ma in Fig. 46, are moved to the beginning of 
the following encoded audio stream SSba. Note that the 
encoded audio stream is moved in audio frame units 
because the audio frame is the unit of reproduction. 
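
The movement of audio frames in this first method can be sketched as a list operation. This is an illustration under stated assumptions and not the movement calculator of the system encoder: each audio frame is modelled as a pair of (decode time in msec, label), frames whose decode time precedes T1 are treated as already consumed, and the function name move_tail_audio and the sample values are hypothetical.

```python
def move_tail_audio(ssaa, ssba, t1_ms, buffer_frames=2):
    """Move the audio frames that cannot be buffered at T1 from SSaa to SSba.

    ssaa, ssba    -- lists of (decode_time_ms, label) in reproduction order
    t1_ms         -- input end time T1 of the video stream SSav
    buffer_frames -- whole audio frames the audio buffer can hold
    Returns the new (ssaa, ssba) lists.
    """
    consumed = [frame for frame in ssaa if frame[0] < t1_ms]
    pending = [frame for frame in ssaa if frame[0] >= t1_ms]
    kept = pending[:buffer_frames]    # accumulated in the audio buffer at T1
    moved = pending[buffer_frames:]   # moved, in whole frames, to the head of SSba
    return consumed + kept, moved + ssba


if __name__ == "__main__":
    ssaa = [(i * 32, f"a{i}") for i in range(10)]
    ssba = [(320, "b0")]
    new_ssaa, new_ssba = move_tail_audio(ssaa, ssba, t1_ms=100)
    print([label for _, label in new_ssaa])
    print([label for _, label in new_ssba])
```

In this example six frames remain at T1; the first two stay in SSaa and the third to sixth are moved to the beginning of SSba, matching the frame count described above.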
[0595] Following the above operation, the encoded 
video stream SSav is packetized as shown in row 2 in 
Fig. 46, and the encoded audio stream SSaa is pack- 
etized as shown in row 4. As shown in row 5, the video 
packets V and audio packets A are then interleaved 
(multiplexed) to maintain an average distribution of 
audio packets to video packets in the FIFO sequence 
described above referenced to the packet input times to 
the buffers 2600 and 2800. After packing and converting 
to a system stream, the data is then recorded to the 
optical disk. 

[0596] In the same manner the encoded video stream 
SSbv is packetized as shown in row 2 in Fig. 46, and the 
encoded audio stream SSba is packetized as shown in 
row 4. As shown in row 5, the video packets V and audio
packets A are then interleaved (multiplexed) to maintain 
an average distribution of audio packets to video pack- 
ets in the FIFO sequence described above referenced 
to the packet input times to the buffers 2600 and 2800. 
After packing and converting to a system stream, the 
data is then recorded to the optical disk. 
[0597] The resulting system streams SSa and SSb 
are thus structured as shown in Fig. 44, enabling seam- 
less reproduction by the DVD decoder DCD shown in 
Fig. 26. 

[0598] Because two audio frames can be accumu- 
lated in the audio buffer, the last audio frame in the sys- 
tem stream SSa stored in the audio buffer at time T1 is 
transferred as the last audio packet in system stream 
SSa during the two-frame reproduction time before 
decoding said last audio frame begins. The maximum 
input end time difference between the video packets 
and audio packets at the end of the system stream SSa 
is therefore the reproduction time of two audio frames. 
[0599] Furthermore, the audio buffer will not underflow 
if the next audio data is input to the audio buffer before 
the presentation end time of the audio frames accumu-
lated in the audio buffer as of time T2. The input time of
the first audio packet in system stream SSb is therefore
at latest within the reproduction time of two audio
frames after time T2 (= the presentation time of the
accumulated audio frames + the reproduction time of
one audio frame). Therefore the maximum input start 
time difference between the video packets and audio 
packets at the beginning of system stream SSb is the 
reproduction time of two audio frames. 
[0600] A second method of producing the system
stream recorded to an optical disk according to the
present embodiment is described next with refer-
ence to Fig. 47. The first, second, third, fourth, and fifth
rows in Fig. 47 simulate the video and audio data
input/output states to the respective buffers referenced
to the same time-base T as shown in Fig. 44.
[0601] The first row in Fig. 47 corresponds to block Ga
in Fig. 44, and simulates the data input/output of video
stream SSav and video stream SSbv to the video buffer
2600.

[0602] The second row similarly corresponds to block 
Gb in Fig. 44, and shows the video data packet stream. 
[0603] The third row similarly corresponds to block Gc 
in Fig. 44, and simulates the data input/output of audio 
stream SSaa and audio stream SSba to the audio buffer
2800. 

[0604] The fourth row similarly corresponds to block 
Gd in Fig. 44, and shows the audio data packet stream. 
[0605] The fifth row similarly corresponds to block Ge
in Fig. 44, and shows the system stream resulting from
interleaving and packing the video packets V shown in
the second row and the audio packets A shown in the
fourth row. The video packets and audio packets are
interleaved in a FIFO manner from the video and audio
buffers referenced to the packet input time to the
respective buffer. In other words, the packed data is
multiplexed referenced to the time the data contained in
the pack is input to the video or audio buffer. The first
common system stream SSa and the second common
system stream SSb following thereafter can be pro-
duced using the first method described above with ref-
erence to Fig. 46.

[0606] A different method for generating the first com-
mon system stream SSa and the second common sys-
tem stream SSb following thereafter, i.e., a method
different from that described with reference to Fig. 46, is
described below with reference to Fig. 47.
[0607] In the first method described above, part of the
encoded audio stream from the preceding system
stream is moved to the following system stream. This
second method, however, is characterized by moving
part of the encoded video and audio streams from the
following system stream to the preceding system stream.
This second method is partic-
ularly effective when the preceding scene (system
stream) is a scene from a multi-scene period, i.e., when
moving data from one of plural scenes (system streams) to
the encoded system stream of a single scene is
extremely difficult.
[0608] With this method the first GOP in video stream
SSbv is moved to video stream SSav. The one GOP
moved from video stream SSbv is connected to video
stream SSav to assure time-base contiguity at the end
of video stream SSav. At the second GOP from the
beginning of video stream SSbv, i.e., the second GOP
counted from the beginning of video stream SSbv
including the first GOP already moved, referenced to
the input start time T2 of the data decoded first, the
audio data of one audio frame is moved to the audio
stream SSaa to accumulate one audio frame in the
audio buffer.

[0609] The one audio frame of data moved from audio 
stream SSba is then connected to audio stream SSaa to 
assure time-base contiguity at the end of audio stream
SSaa. 

[0610] The video data is moved in GOP units because
the GOP is the unit of video data reproduction. Audio
data is likewise moved in audio frame units because the
audio frame is the unit of audio reproduction.
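
The second method can likewise be sketched as a list operation in which whole units are moved from the head of the following system stream to the tail of the preceding one. This is an illustration only: the function name move_head_to_preceding and the labels are hypothetical, and the number of units to move is passed in as a parameter rather than derived from a buffer simulation referenced to time T2.

```python
def move_head_to_preceding(ssav, ssaa, ssbv, ssba, gops=1, audio_frames=1):
    """Move leading GOPs and audio frames of SSb to the end of SSa.

    ssav, ssbv -- lists of GOPs (the unit of video reproduction)
    ssaa, ssba -- lists of audio frames (the unit of audio reproduction)
    Returns the new (ssav, ssaa, ssbv, ssba) lists.
    """
    new_ssav = ssav + ssbv[:gops]          # appended so the time base stays contiguous
    new_ssaa = ssaa + ssba[:audio_frames]
    return new_ssav, new_ssaa, ssbv[gops:], ssba[audio_frames:]


if __name__ == "__main__":
    result = move_head_to_preceding(
        ["gopA1", "gopA2"], ["a1", "a2"],
        ["gopB1", "gopB2"], ["b1", "b2"],
    )
    for stream in result:
        print(stream)
```

Slicing whole list elements reflects the rule stated above that video is moved only in GOP units and audio only in audio frame units.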
[0611] Following the above operation, the encoded 
video stream SSav is packetized as shown in row 2 in 
Fig. 47, and the encoded audio stream SSaa is pack- 
etized as shown in row 4. As shown in row 5, the video 
packets V and audio packets A are then interleaved
(multiplexed) to maintain an average distribution of 
audio packets to video packets in the FIFO sequence 
described above referenced to the packet input times to 
the buffers 2600 and 2800. After packing and converting 
to a system stream, the data is then recorded to the
optical disk. 

[0612] In the same manner the encoded video stream
SSbv is packetized as shown in row 2 in Fig. 47, and the
encoded audio stream SSba is packetized as shown in
row 4. As shown in row 5, the video packets V and audio
packets A are then interleaved (multiplexed) to maintain
an average distribution of audio packets to video pack-
ets in the FIFO sequence described above referenced
to the packet input times to the buffers 2600 and 2800.
After packing and converting to a system stream, the
data is then recorded to the optical disk.
[0613] The resulting system streams SSa and SSb 
are thus structured as shown in Fig. 39, enabling seam- 
less reproduction by the DVD decoder DCD shown in 
Fig. 26.
[0614] Because two audio frames can be accumu- 
lated in the audio buffer, the last audio frame in the sys- 
tem stream SSa stored in the audio buffer at time T1 is 
transferred as the last audio packet in system stream 
SSa during the two-frame reproduction time before
decoding said last audio frame begins. The maximum 
input end time difference between the video packets 
and audio packets at the end of the system stream SSa 
is therefore the reproduction time of two audio frames. 
[0615] Furthermore, the audio buffer will not underflow
if the next audio data is input to the audio buffer before
the presentation end time of the audio frames accumu-
lated in the audio buffer as of time T2. The input time of
the first audio packet in system stream SSb is therefore
at latest within the reproduction time of two audio 
frames after time T2 (= the presentation time of the 
accumulated audio frames + the reproduction time of 
one audio frame). Therefore the maximum input start 
time difference between the video packets and audio 
packets at the beginning of system stream SSb is the 
reproduction time of two audio frames. 
[0616] The next embodiment relates to connecting the
system stream branches obtained by means of the sys- 
tem encoder according to the preferred embodiment of 
the present invention. 

[0617] The physical structure of the optical disk, the 
overall data structure of the optical disk, and the DVD 
decoder DCD in the present embodiment are as 
described above, and further description thereof is thus 
omitted below. 

[0618] Note that the description of the present embod-
iment below is limited to a single encoded video stream
and a single encoded audio stream for simplicity. 
[0619] Fig. 48 shows the structure of the end of the 
second common system stream SSb, and the begin- 
nings of the two parental lock control system streams 
SSc and SSd that can be connected to the end of com- 
mon system stream SSb. Note that the common system 
stream SSb and one of the two parental lock control 
system streams SSc and SSd are arrayed to the same 
time-base (horizontal time axis T) as shown in Fig. 46. 
[0620] System streams SSb, SSc, and SSd shown as 
separate blocks in Fig. 48 represent the following con- 
tent as in Fig. 46. 

[0621] The fifth row in each block shows the structure
of system streams SSb, SSc, and SSd. System stream 
SSb comprises video stream SSbv and audio stream 
SSba; system stream SSc similarly comprises video 
stream SScv and audio stream SSca; and system 
stream SSd similarly comprises video stream SSdv and 
audio stream SSda. 

[0622] The fourth rows show the audio packet streams 
A of audio stream SSba, audio stream SSca, and audio
stream SSda extracted from system streams SSb, SSc, 
and SSd. 

[0623] The third rows show the data input/output state 
of the audio buffer 2800 when audio stream SSba, 
audio stream SSca and audio stream SSda are input to 
a DVD decoder DCD shown in Fig. 26. 
[0624] The second rows show the video packet 
streams V of video stream SSbv, video stream SScv, 
and video stream SSdv extracted from system streams 
SSb, SSc, and SSd. 

[0625] The first rows show the data input/output state 
of the video buffer 2600 when video stream SSbv, video 
stream SScv, and video stream SSdv are input to a DVD 
decoder DCD. 

[0626] The audio content of the first several audio 
frames in audio stream SSca and audio stream SSda at 
the beginning of system stream SSc and system stream 
SSd is the same.
[0627] When system stream SSb is input to the DVD
decoder DCD, the difference between the input end 
times of the video stream SSbv and audio stream SSba 
to the respective buffers 2600 and 2800 is small, and at 
most is less than the reproduction time of two audio 
frames. 

[0628] When system stream SSc is input to the DVD
decoder DCD, the difference between the input start
times of the video stream SScv and audio stream SSca
to the respective buffers 2600 and 2800 is small, and
is less than the reproduction time of two audio
frames.

[0629] When system stream SSd is input to the DVD
decoder DCD, the difference between the input start
times of the video stream SSdv and audio stream SSda
to the respective buffers 2600 and 2800 is small, and
is less than the reproduction time of two audio
frames.

[0630] The data input/output state of the video buffer 
2600 when system stream SSb is connected to and 
contiguously reproduced with system stream SSc or 
system stream SSd is the same as shown in Fig. 44. 
Specifically, system stream SSa in Fig. 44 corresponds 
to system stream SSb in Fig. 48, and system stream 
SSb in Fig. 44 corresponds to either system stream SSc 
or system stream SSd in Fig. 48. 
[0631] When system stream SSb and system stream 
SSd or system stream SSc in Fig. 48 are contiguously 
reproduced using the DVD decoder DCD shown in Fig. 
26, the video buffer also does not overflow as described 
above with reference to Fig. 44. As a result, seamless 
reproduction can be achieved when system stream SSb 
is connected and contiguously reproduced with system 
stream SSc or system stream SSd. 
[0632] Note also that system streams SSb, SSc, and
SSd are produced using the method described with ref- 
erence to Fig. 46. 

[0633] The data structure of system streams SSb, 
SSc, and SSd produced according to the method shown 
in Fig. 46 is shown in Fig. 48, and seamless reproduc- 
tion can therefore be achieved using the DVD decoder 
DCD shown in Fig. 26.

[0634] As described with reference to the audio frame
movement in Fig. 46, the maximum input end time differ-
ence between the video packets and audio packets at
the end of the system stream SSb is at most the repro- 
duction time of two audio frames, and the maximum 
input start time difference between the video packets 
and audio packets at the beginning of system stream 
SSc or SSd is at most the reproduction time of two 
audio frames. 

[0635] When the audio frame moved from audio 
stream SSba is connected to destination audio streams 
SSca and SSda, an audio reproduction stop, i.e., an 
audio reproduction gap, is provided when making the 
connection. As a result, the differences in the video 
reproduction time and the audio reproduction time of 
each reproduction path can be corrected based on the
reproduction gap information in the system streams not
shared between different program chains PGC. As a
result, this video and audio reproduction time difference
can be prevented from affecting the process connecting
preceding and following system streams.

[0636] Fig. 49 is used to describe the difference in the 
video reproduction time and audio reproduction time of 
different reproduction paths according to the present 
embodiment. In Fig. 49, time Tb represents the time dif-
ference between the audio and video reproduction end
times at the end of the system stream common to the 
adult-oriented title and the child-oriented title before 
moving the audio data; time Tc is the time difference 
between the audio and video reproduction start times at 
the beginning of the adult-oriented title before audio
data movement; and time Td is the time difference 
between the audio and video reproduction start times at 
the beginning of the child-oriented title before audio 
data movement. 
[0637] It is possible to match the time difference
between the audio and video reproduction start times of 
at least one of the plural different reproduction paths fol- 
lowing the branch to the time difference of the audio and 
video reproduction end times before the branch. Note 
that it is assumed in the following description that
Tb = Tc, and Tb < Td.

[0638] Because Tb = Tc in the adult-oriented title
after the branch, the audio frame moved from the com-
mon part of the adult-oriented and child-oriented title
streams can be connected to the beginning of the adult-
oriented title without an audio reproduction gap.
[0639] To enable seamless reproduction between sys-
tem stream SSb and system stream SSc at the connec-
tion, the system streams are generated using the first
system stream encoding method described above with
reference to moving audio data from one system stream
SSb to another system stream SSc.
[0640] The method of producing the system streams
is the same as that described above with reference to
Fig. 46 except that system streams SSa and SSb are
replaced by system streams SSb and SSc in Fig. 49,
and further description thereof is thus omitted below.
[0641] Because Tb < Td in the child-oriented title after
the branch, the audio frame moved from the common
part of the adult-oriented and child-oriented title
streams can be connected to the beginning of the child-
oriented title stream with an audio reproduction gap of
only Td - Tb.

[0642] To enable seamless reproduction between sys-
tem stream SSb and system stream SSd at the connec-
tion, the system streams are generated using the first
system stream encoding method described above with
reference to moving audio data from one system stream
SSb to another system stream SSd.
[0643] The method of producing the system streams
is the same as that described above with reference to
Fig. 46 except that system streams SSa and SSb are
replaced by system streams SSb and SSd in Fig. 49,
and further description thereof is thus omitted below.
[0644] Note that packetizing in this case is controlled
so that the audio frames before and after the audio
reproduction gap are not included in the same packet.
As a result, it is possible to write the audio playback
starting time information APTS (the audio frame repro-
duction start time including the audio reproduction
pause time) of the audio frames before and after the
audio reproduction gap into the system stream.
[0645] The packet containing the audio frame immedi-
ately preceding the audio reproduction gap is of neces-
sity small. During the packing process a padding packet
is therefore used to produce a fixed-length pack of 2048 
bytes/pack. 
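
Filling a short pack up to the fixed length can be sketched as follows. This is a simplified illustration and not the pack producer of this embodiment: the padding packet layout used here (start code prefix 0x000001, stream ID 0xBE, a 16-bit length, then 0xFF filler bytes) follows the general MPEG-2 program stream syntax rather than anything stated in this description, remainders shorter than the 6-byte padding packet header are simply rejected, and the function names are hypothetical.

```python
PACK_SIZE = 2048  # fixed pack length in bytes


def padding_packet(total_len):
    """Build a padding packet occupying exactly total_len bytes."""
    if total_len < 6:
        raise ValueError("too short for a padding packet header in this sketch")
    body_len = total_len - 6
    header = b"\x00\x00\x01\xbe" + body_len.to_bytes(2, "big")
    return header + b"\xff" * body_len


def fill_pack(content):
    """Append a padding packet so that the pack is exactly PACK_SIZE bytes."""
    remaining = PACK_SIZE - len(content)
    if remaining < 0:
        raise ValueError("content exceeds the pack size")
    if remaining == 0:
        return content
    return content + padding_packet(remaining)


if __name__ == "__main__":
    pack = fill_pack(b"\x00" * 1000)  # 1000 bytes stand in for headers plus audio
    print(len(pack))
```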

[0646] The audio reproduction gap information for the
audio reproduction gap of system stream SSd in this
embodiment is inserted to the system stream by writing
the audio frame reproduction end time immediately
before the audio reproduction gap of the child-oriented
title to the audio reproduction stopping time 1
(VOB_A_STP_PTM1) in the navigation pack NV (Fig.
20), and writing the audio reproduction gap time Td - Tb
to the audio reproduction stopping period 1
(VOB_A_GAP_LEN1) in the DSI packet.
[0647] When there is no audio reproduction gap, a 0
value is written to the audio reproduction stopping time 1
(VOB_A_STP_PTM1) so that the absence of an audio
reproduction gap can be determined.
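
The way the two gap fields are filled in can be sketched as follows. This is an illustration under stated assumptions only: the class AudioGapInfo, its attribute names stop_ptm and gap_len (standing in for the audio reproduction stopping time 1 and the audio reproduction stopping period 1), the function names, and the use of 90 kHz clock ticks as the time unit are all hypothetical and are not defined by this description.

```python
from dataclasses import dataclass


@dataclass
class AudioGapInfo:
    stop_ptm: int  # audio reproduction stopping time 1 (0 means no gap)
    gap_len: int   # audio reproduction stopping period 1


def make_gap_info(last_frame_end, td, tb):
    """Build the gap information for a path whose gap is Td - Tb."""
    gap = td - tb
    if gap <= 0:
        # Tb equals the path's own difference: no audio reproduction gap.
        return AudioGapInfo(stop_ptm=0, gap_len=0)
    return AudioGapInfo(stop_ptm=last_frame_end, gap_len=gap)


def has_gap(info):
    """A reader can tell a gap exists from a non-zero stopping time."""
    return info.stop_ptm != 0


if __name__ == "__main__":
    child_path = make_gap_info(last_frame_end=900_000, td=1800, tb=900)
    adult_path = make_gap_info(last_frame_end=900_000, td=900, tb=900)
    print(child_path, has_gap(child_path))
    print(adult_path, has_gap(adult_path))
```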
[0648] By means of the above process, it is possible
to set the time difference between the different audio
and video reproduction times of different reproduction
paths to the audio reproduction gap of the system
streams not shared by different program chains PGC.
[0649] In addition, by writing information relating to the
audio reproduction gap to the reproduction control
information contained in the navigation packs NV, the audio
reproduction gap and the information relating to the
audio reproduction gap can all be contained within a
single system stream.

[0650] Furthermore, by containing the audio reproduction
gap and the information relating to the audio
reproduction gap within a single system stream, it is
possible to move the audio reproduction gap within the
system stream. It is therefore possible to move the
audio reproduction gap to a silent space or other place
where it is least audibly perceptible, and thereby achieve
a more seamless reproduction.

[0651] The internal structure of the system encoder
900 in the DVD encoder ECD shown in Fig. 25 is shown
in detail in the block diagram in Fig. 50. Note that the
system encoder 900 generates the system streams.
[0652] As shown in Fig. 50, the system encoder 900
comprises an elementary stream buffer 3301 for temporarily
storing the video, sub-picture, and audio data; a
video analyzer 3302 for simulating the video buffer
state; a sub-picture analyzer 3308 for simulating the
sub-picture buffer state; an audio analyzer 3303 for
simulating the audio buffer state; a movement calculator
3304 for calculating the number of audio frames to
move; a packet producer 3305 for packetizing the video
data, audio data, and sub-picture data; a multiplexer
3306 for determining the packet arrangement; and a
pack producer 3307 for packing the packets to produce
the system stream.

[0653] The elementary stream buffer 3301 is con- 
nected to the video stream buffer 400, sub-picture 
stream buffer 600, and audio stream buffer 800 shown 
in Fig. 26, and temporarily stores the elementary 
streams. The elementary stream buffer 3301 is also 
connected to the packet producer 3305. 
[0654] The video analyzer 3302 is connected to the 
video stream buffer 400, thus receives the encoded 
video stream St27, simulates the video buffer state, and 
supplies the simulation result to the movement calcula- 
tor 3304 and multiplexer 3306. 

[0655] The audio analyzer 3303 is likewise connected 
to the audio stream buffer 800, thus receives the 
encoded audio stream St31, simulates the audio buffer 
state, and supplies the simulation result to the move- 
ment calculator 3304 and multiplexer 3306. 
[0656] The sub-picture analyzer 3308 is likewise con- 
nected to the sub-picture stream buffer 600, thus 
receives the encoded sub-picture stream St29, simu- 
lates the sub-picture buffer state, and supplies the sim- 
ulation result to the movement calculator 3304 and 
multiplexer 3306. 

[0657] Based on the simulated buffer states, the
movement calculator 3304 calculates the audio movement
(number of audio frames) and the audio reproduction
gap information, and supplies the calculation
results to the packet producer 3305 and multiplexer
3306. More specifically, the movement calculator 3304
calculates the audio data movement MFAp1 from the
preceding scene, the audio data movement MFAp2 to
the preceding scene, the movement MGVp of 1 GOP of
video data to the preceding scene, the movement MGVf
of 1 GOP of video data from the following scene, the
movement MFAf1 of audio data to the following scene,
and the movement MFAf2 of audio data from the following
scene.

[0658] The packet producer 3305 produces the video, 
sub-picture, and audio packets from the video data, 
sub-picture data, and audio data stored in the elemen- 
tary stream buffer 3301 according to the audio move- 
ment calculated by the movement calculator 3304. The 
packet producer 3305 also produces the reproduction 
control information, i.e., the navigation packs NV. The 
audio reproduction gap information is also written to the 
navigation packs NV at this time. 
[0659] Based on the audio reproduction gap informa- 
tion and the video and audio buffer state information 
simulated by the video analyzer 3302 and audio ana- 
lyzer 3303, the multiplexer 3306 rearranges, i.e., multi- 
plexes, the video packets, audio packets, and 
navigation packs NV. The movement calculator 3304 
also performs based on the audio reproduction gap 
information.

[0660] The pack producer 3307 then packs the pack- 
ets, adds the system header, and produces the system 
stream. 

[0661] Note that the operation of the system encoder
900 is described in detail below with reference to Fig.
53.

[0662] The present embodiment relates to connecting
system streams by coupling. The next embodiment
relates to connecting system streams at the trailing end
of a multi-scene period, i.e., connecting one of plural
preceding system streams to the common system
stream following thereafter.

[0663] The physical structure of the optical disk, the
overall data structure of the optical disk, and the DVD
decoder DCD in the present embodiment are as
described above, and further description thereof is thus
omitted below.

[0664] Note that the description of the present
embodiment below is limited to a single encoded video stream
and a single encoded audio stream for simplicity.
[0665] Fig. 51 shows the structure of the end of the
two parental lock control system streams SSc and SSd,
and the beginning of the following common system
stream SSe that can be connected to either of the
preceding parental lock control system streams SSc and
SSd. Note that this figure is basically the same as Fig.
48 in which the parental lock control system streams are
the following system streams.

[0666] Note that one of the two parental lock control
system streams SSc and SSd and the common system
stream SSe are arrayed to the same time-base
(horizontal time axis T) as shown in Fig. 51.
[0667] System streams SSc, SSd, and SSe shown as
separate blocks in Fig. 51 represent the following
content as in Fig. 46.

[0668] The fifth row in each block shows the structure
of system streams SSc, SSd, and SSe. System stream
SSc comprises video stream SScv and audio stream
SSca; system stream SSd similarly comprises video
stream SSdv and audio stream SSda; and system
stream SSe comprises video stream SSev and audio
stream SSea.

[0669] The fourth rows show the audio packet streams
A of audio stream SSca, audio stream SSda, and audio
stream SSea, extracted from system streams SSc, SSd,
and SSe.

[0670] The third rows show the data input/output state
of the audio buffer 2800 when audio stream SSca, audio
stream SSda, and audio stream SSea are input to the
DVD decoder DCD.

[0671] The second rows show the video packet
streams V of video stream SScv, video stream SSdv,
and video stream SSev extracted from system streams
SSc, SSd, and SSe.
[0672] The first rows show the data input/output state
of the video buffer 2600 when video stream SScv, video
stream SSdv, and video stream SSev are input to the
DVD decoder DCD.

[0673] At the end of system streams SSc and SSd, the
video content of at least the last GOP in each video
stream SScv and SSdv is the same.
[0674] Likewise, the audio content of the last several 
audio frames in audio streams SSca and SSda at the 
end of system streams SSc and SSd is the same. 
[0675] When system stream SSc is input to the DVD 
decoder DCD, the difference between the input end 
times of the video stream SScv and audio stream SSca 
to the respective buffers 2600 and 2800 is small, and at 
most is less than the reproduction time of two audio 
frames. 

[0676] When system stream SSd is input to the DVD 
decoder DCD, the difference between the input end 
times of the video stream SSdv and audio stream SSda 
to the respective buffers 2600 and 2800 is small, and at 
most is less than the reproduction time of two audio 
frames. 

[0677] When system stream SSe is input to the DVD
decoder DCD, the difference between the input end
times of the video stream SSev and audio stream SSea
to the respective buffers 2600 and 2800 is small, and at
most is less than the reproduction time of two audio
frames.

[0678] The data input/output state of the video buffer 
2600 when system stream SSc or system stream SSd is 
connected to and contiguously reproduced with system 
stream SSe is the same as shown in Fig. 44. Specifi- 
cally, system stream SSa in Fig. 44 corresponds to 
either system stream SSc or system stream SSd in Fig. 
51, and system stream SSb in Fig. 44 corresponds to 
system stream SSe in Fig. 51. 

[0679] Seamless reproduction can thus be achieved 
when system stream SSc or system stream SSd is con- 
nected and contiguously reproduced with system 
stream SSe. 

[0680] Note also that system streams SSc, SSd, and 
SSe are produced using the second method described 
above with reference to Fig. 47. More specifically, the 
system streams can be similarly created by substituting 
system streams SSc and SSd in Fig. 51 for system 
stream SSa in Fig. 47, and substituting system stream 
SSe in Fig. 51 for system stream SSb in Fig. 47. The 
method of generating the system streams is as 
described above with reference to Fig. 47. 
[0681] The data structure of system streams SSc,
SSd, and SSe produced according to the method
shown in Fig. 47 is shown in Fig. 51, and seamless
reproduction can therefore be achieved using the DVD
decoder DCD shown in Fig. 26.
[0682] As described with reference to the audio frame
movement in Fig. 46, the maximum input end time
difference between the video packets and audio packets at
the end of system streams SSc and SSd is at most the
reproduction time of two audio frames, and the maximum
input start time difference between the video packets
and audio packets at the beginning of system
stream SSe is at most the reproduction time of two
audio frames.

[0683] By providing an audio reproduction stop, i.e.,
an audio reproduction gap, when moving and connecting
audio frames from audio stream SSea to destination
audio streams SSca and SSda, the differences in the
video reproduction time and the audio reproduction time
of each reproduction path can be contained within the
system streams not shared between different program
chains PGC.
[0684] Fig. 52 is used to describe the difference in the
video reproduction time and audio reproduction time of
different reproduction paths according to the present
embodiment. In Fig. 52, time Te represents the time
difference between the audio and video reproduction start
times at the beginning of the adult-oriented title before
audio data movement; time Tc' is the time difference
between the audio and video reproduction end times at
the end of the adult-oriented title stream; and time Td' is
the time difference between the audio and video
reproduction end times at the end of the child-oriented title
stream before moving the audio data.
[0685] It is possible to match the time difference
between the audio and video reproduction end times of
at least one of the plural different reproduction paths
before the connection with the time difference of the
audio and video reproduction start times following the
connection. Note that it is assumed in the following
description that Te = Tc', and Td' < Te.
[0686] Because Te = Tc' in the adult-oriented title
before the connection, the audio frame moved from the
common part of the adult-oriented and child-oriented
title streams can be connected to the end of the
adult-oriented title stream without an audio reproduction gap.
A seamless stream is then produced after the connection
as shown in the figure.

[0687] Because Td' < Te in the child-oriented title
stream before the connection, the audio frame moved
from the common part of the adult-oriented and
child-oriented title streams can be connected to the end of
the child-oriented title stream with an audio reproduction
gap of only Te - Td'.
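
The relation between the time differences and the resulting gap can be sketched as below. This is only an illustration under stated assumptions: the function name `audio_gap` and the integer time units are hypothetical, and the numeric values are invented for the example rather than taken from Fig. 52.

```python
def audio_gap(te: int, t_end: int) -> int:
    """Gap needed at the end of one reproduction path.

    te    -- audio/video start time difference after the connection (Te)
    t_end -- audio/video end time difference of the path before the
             connection (Tc' or Td')
    """
    if t_end > te:
        raise ValueError("sketch only covers paths with t_end <= Te")
    return te - t_end


TE = 5760        # illustrative value for Te
TC_PRIME = 5760  # adult-oriented path: Te = Tc'
TD_PRIME = 2880  # child-oriented path: Td' < Te

gap_adult = audio_gap(TE, TC_PRIME)  # no gap is needed on this path
gap_child = audio_gap(TE, TD_PRIME)  # gap of Te - Td' on this path
```

With these numbers the adult-oriented path connects without a gap, while the child-oriented path carries a gap equal to Te - Td'.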

[0688] To enable seamless reproduction between system
stream SSc and system stream SSd at the connection
with system stream SSe, the system streams are
generated using the second system stream encoding
method described above with reference to moving the
encoded video stream and audio data from one system
stream SSe to another system stream SSc and SSd.
[0689] The method of producing the system streams
is the same as that described above with reference to
Fig. 47 except that system streams SSc and SSd in Fig.
51 are substituted for system stream SSa in Fig. 47, and
system stream SSe in Fig. 51 is substituted for system
stream SSb in Fig. 47, and further description thereof is
thus omitted below.

[0690] When producing these system streams, the
packets are generated so that the audio frames before
and after the audio reproduction gap are not contained
in the same packet. As a result, it is possible to write the
audio playback starting time information APTS (the
audio frame reproduction start time including the audio
reproduction pause time) of the audio frames before
and after the audio reproduction gap into the system
stream.

[0691] The packet containing the audio frame immedi- 
ately preceding the audio reproduction gap is of neces- 
sity small. During the packing process a padding packet 
is therefore used to produce a fixed-length pack of 2048 
bytes/pack. 

[0692] The audio reproduction gap information for the 
audio reproduction gap of system stream SSd in this 
embodiment is inserted to the system stream by writing 
the audio frame reproduction end time immediately 
before the audio reproduction gap of the child-oriented 
title to the audio reproduction stopping time 2 
(VOB_A_STP_PTM2) in the navigation pack NV (Fig. 
20), and writing the audio reproduction gap time Te - Td' 
to the audio reproduction stopping period 2 
(VOB_A_GAP_LEN2) in the DSI packet. 
[0693] When there is no audio reproduction gap, it is 
possible to determine that there is no audio reproduc- 
tion gap by writing a 0 value to the audio reproduction 
stopping time 2 (VOB_A_STP_PTM2). 
[0694] By means of the above process, it is possible 
to set the time difference between the different audio 
and video reproduction times of different reproduction 
paths to the audio reproduction gap of the system 
streams not shared by different program chains PGC. 
[0695] In addition, by writing information relating to the 
audio reproduction gap to the reproduction control infor- 
mation contained in the navigation packs NV, the audio 
reproduction gap and the information relating to the 
audio reproduction gap can all be contained within a 
single system stream. 

[0696] Furthermore, by containing the audio repro- 
duction gap and the information relating to the audio 
reproduction gap within a single system stream, it is 
possible to move the audio reproduction gap within the 
system stream. It is therefore possible to move the 
audio reproduction gap to a silent space or other place 
where least audibly perceptible, achieve seamless data 
reproduction not permitting the audio buffer to under- 
flow, and thereby achieve seamless reproduction of the 
audio information that is important for human perception 
of data contiguity. 

[0697] The system streams described above can be 
produced using the system encoder 900 of the DVD 
encoder ECD shown in Fig. 25. The structure of the sys- 
tem encoder 900 is as described above with reference 
to Fig. 50, and further description thereof is thus omitted 
below. 

[0698] The process of producing the above described
system streams is described below with reference to
Fig. 53. Note that this process is the system encoding
subroutine shown as step #2200 of the system encoder
flow chart shown in Fig. 34.

System encoder flow chart

[0699] The system encoding process is described
below with reference to Fig. 53.
[0700] At step #307002 the conditions for connecting
with the preceding scene are evaluated based on the
state of the preceding VOB seamless connection flag
VOB_Fsb. If a non-seamless connection with the
preceding scene is specified, i.e., VOB_Fsb ≠ 1, the
procedure moves to step #307010.

[0701] At step #307010 the movement calculator 3304
(Fig. 50) sets the audio data movement MFAp1 from the
preceding scene, i.e., the number of audio frames
moved, to 0 based on the VOB_Fsb ≠ 1 declaration. The
procedure then moves to step #307014.
[0702] If a seamless connection with the preceding
scene is specified, i.e., VOB_Fsb = 1, at step #307002,
the procedure moves to step #307004.
[0703] At step #307004 it is determined whether the
preceding scene is in a multi-scene period. If it is not,
the procedure moves to step #307012; if it is, the
procedure moves to step #307006.

[0704] At step #307012 the audio data movement
MFAp1 from the preceding scene is calculated, and the
procedure moves to step #307014. Note that the
method of calculating the audio data movement MFAp1
is described after this process with reference to Fig. 54
below.
[0705] At step #307006 the movement MGVp of 1
GOP of video data to the preceding scene is calculated,
and the procedure moves to step #307008. If the
preceding scene is in a multi-scene period, it is not possible
to uniformly calculate the audio data movement MFAp1
as in step #307012. As a result, the movement of one
GOP of video data from the beginning of the present
scene to the preceding scene is calculated.
[0706] At step #307008 the audio data movement
MFAp2 to the preceding scene is calculated, and the
procedure moves to step #307014. Note that the method of
calculating the audio data movement MFAp2 is
described after this process with reference to Fig. 55
below.

[0707] At step #307014 the conditions for connecting
with the following scene are evaluated based on the
state of the following VOB seamless connection flag
VOB_Fsf. If a non-seamless connection with the following
scene is specified, i.e., VOB_Fsf ≠ 1, the procedure
moves to step #307022. If a seamless connection with
the following scene is specified, i.e., VOB_Fsf = 1, the
procedure moves to step #307016.
[0708] At step #307022 the movement calculator 3304
(Fig. 50) sets the audio data movement MFAf1 to the
following scene to 0 based on the VOB_Fsf ≠ 1
declaration. The procedure then moves to step #307026.
[0709] At step #307016 it is determined whether the
following scene is in a multi-scene period based on the
multi-scene flag VOB_Fp. If it is not, i.e., VOB_Fp ≠ 1,
the procedure moves to step #307024; if it is, i.e.,
VOB_Fp = 1, the procedure moves to step #307018.
[0710] At step #307024 the audio data movement
MFAf1 to the following scene is calculated, and
the procedure moves to step #307026. Note that the
method of calculating the audio data movement MFAf1
is the same as that used in step #307012.
[0711] At step #307018 the movement MGVf of 1 
GOP of video data from the following scene is calcu- 
lated, and the procedure moves to step #307020. 
[0712] At step #307020 the audio data movement
MFAf2 from the following scene is calculated, and the
procedure moves to step #307026. Note that the
method of calculating the audio data movement MFAf2
is the same as that used in step #307008.
[0713] At step #307026 the audio reproduction stopping
time 1 (VOB_A_STP_PTM1) and the audio reproduction
stopping period 1 (VOB_A_GAP_LEN1) are
calculated from the audio and video end times of the
preceding scene. The procedure then moves to step
#307028.

[0714] At step #307028 the audio reproduction stopping
time 2 (VOB_A_STP_PTM2) and the audio reproduction
stopping period 2 (VOB_A_GAP_LEN2) are
calculated from the audio and video start times in the
following scene. The procedure then moves to step
#307030.

[0715] At step #307030 the audio data, including the
audio movement, is packetized, and the procedure 
moves to step #307032. 

[0716] At step #307032 the video data, including the 
video movement, is packetized, and the procedure 
moves to step #307034. 

[0717] At step #307034 the navigation pack NV is 
generated, the audio reproduction stopping time 1 
(VOB_A_STP_PTM1) and the audio reproduction stop- 
ping period 1 (VOB_A_GAP_LEN1), and the audio 
reproduction stopping time 2 (VOB_A_STP_PTM2) and 
the audio reproduction stopping period 2 
(VOB_A_GAP_LEN2) are recorded, and the procedure 
moves to step #307036. 

[0718] At step #307036 the video packets V, audio 
packets A, and navigation pack NV are multiplexed. 
[0719] As described above, it is thus possible to move
audio and video data between scenes according to the 
conditions for connections with the preceding and fol- 
lowing scenes, and generate the system stream accord- 
ingly. 
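
The branching of Fig. 53 up to step #307026 can be summarized in a short sketch. It is a hedged illustration only: the function `plan_movements` and its boolean arguments are hypothetical stand-ins for the flags VOB_Fsb, VOB_Fsf, and VOB_Fp, and the quantity names follow the list of values computed by the movement calculator 3304 (MFAp1, MFAp2, MGVp, MGVf, MFAf1, MFAf2). The sketch records which quantity is set to 0 and which is calculated; it does not perform the calculations.

```python
def plan_movements(seamless_prev: bool, prev_multi_scene: bool,
                   seamless_next: bool, next_multi_scene: bool) -> dict:
    """Return which movement quantities are zeroed or calculated."""
    plan = {}

    # Connection with the preceding scene (steps #307002 to #307012).
    if not seamless_prev:
        plan["MFAp1"] = "zero"           # step #307010
    elif not prev_multi_scene:
        plan["MFAp1"] = "calculate"      # step #307012
    else:
        plan["MGVp"] = "calculate"       # step #307006
        plan["MFAp2"] = "calculate"      # step #307008

    # Connection with the following scene (steps #307014 to #307024).
    if not seamless_next:
        plan["MFAf1"] = "zero"           # step #307022
    elif not next_multi_scene:
        plan["MFAf1"] = "calculate"      # step #307024
    else:
        plan["MGVf"] = "calculate"       # step #307018
        plan["MFAf2"] = "calculate"      # step #307020

    return plan


# A common scene preceded by a multi-scene period, with a seamless
# connection to an ordinary following scene.
example = plan_movements(True, True, True, False)
```

After this plan is fixed, the flow continues with the gap calculation, packetizing, navigation pack generation, and multiplexing steps described above.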

[0720] The method of calculating the audio data
movement MFAp1 in step #307012 above is described
below with reference to Fig. 54.
[0721] In Fig. 54 video 1 is the video data at the end
of the preceding scene, with the video 1 line representing
the change in video data accumulation at the end of
the preceding scene in the video buffer 2600 of the DVD
decoder DCD; video 2 is similarly the video data at the
beginning of said scene with the video 2 line representing
the change in said video data accumulation in the
video buffer 2600 at the beginning of said scene.
[0722] Note that both video 1 and video 2 represent
the state of the video buffer before system stream
connection. VDTS is the time video 2 is first decoded; tv is
the video 2 transfer start time, and is calculated from
equation 1 below, where video buffer verifier delay
vbv_delay is defined as the time from the start of data
input to the video buffer to the start of decoding. If
decoding starts at vbv_delay after the start of data input
to the video buffer, a video buffer data underflow state
can be reliably prevented during the following decoding
process.

tv = VDTS - vbv_delay [1]

[0723] Audio 1 shows the transfer of audio frames at 
the end of the preceding scene to the audio buffer 
where af1, af2, af3, and af4 are the audio frames con- 
tained in audio 1. Note that the audio frame is the 
encoding process unit, and contains the audio data for a 
defined period of time (Af). 

[0724] Audio 2 shows the transfer of audio frames at 
the beginning of the scene to the audio buffer where af5 
and af6 are the audio frames contained in audio 2. 
[0725] APTS is the time the audio in audio 2 is first 
reproduced. 

[0726] The audio frames (af3, af4) transferred during the
period from time tv to time APTS, i.e., the number of audio
frames (Amove) MFAp1 attached to audio 1 that are transferred
after the start of video 2 transfer, is calculated according
to equation 2.

Amove = (APTS - tv - Af) / Af [2]

[0727] The audio data movement (number of audio 
frames) from the preceding scene is thus calculated. 
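
Equations 1 and 2 can be evaluated together as in the sketch below. The function names and the numeric values are illustrative assumptions, chosen only so that the arithmetic is easy to follow; the text does not state time units or how a fractional result is rounded, so the raw quotient is returned.

```python
def video_transfer_start(vdts: float, vbv_delay: float) -> float:
    """Equation 1: tv = VDTS - vbv_delay."""
    return vdts - vbv_delay


def frames_from_preceding(apts: float, tv: float, af: float) -> float:
    """Equation 2: Amove = (APTS - tv - Af) / Af.

    af is the duration of one audio frame (Af) and must be positive.
    """
    if af <= 0:
        raise ValueError("audio frame duration must be positive")
    return (apts - tv - af) / af


# Illustrative values in arbitrary time units.
tv = video_transfer_start(vdts=1000, vbv_delay=300)    # 700
amove = frames_from_preceding(apts=1000, tv=tv, af=100)
```

With these invented values two audio frames would be moved; real values depend on the buffer simulation performed by the analyzers.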
[0728] The method of calculating the audio data
movement MFAp2 to the preceding scene in step
#307008 above is described below with reference to
Fig. 55.

[0729] As in Fig. 54, video 1 is the video data at the
end of the preceding scene, and video 2 is similarly the
video data at the beginning of said scene. Note that
both video 1 and video 2 show the video buffer state
before scene connection. VDTS is the time video 2 is
first decoded; GOP_move is the one GOP video data
MGVp moved in step #307006; tv is the time video 2
transfer starts after moving the GOP_move quantity of
GOP, and can be uniformly calculated.
[0730] Audio 1 shows the transfer of audio frames at
the end of the preceding scene to the audio buffer
where af1, af2, af3, and af4 are the audio frames
contained in audio 1. Note that the audio frame is the
encoding process unit, and contains the audio data for a
defined period of time (Af).

[0731] Audio 2 shows the transfer of audio frames at
the beginning of the scene to the audio buffer where
af5, af6, and af7 are the audio frames contained in
audio 2.

[0732] APTS is the time the audio in audio 2 is first 
reproduced. 

[0733] The audio frames (af5, af6, af7) transferred
during the period from time tv to time APTS, i.e., the number
of audio frames (Amove) MFAp2 attached to audio 2 that are
transferred before the start of video 2 transfer after moving
the GOP_move quantity of GOP, is calculated according to
equation 3.

Amove = (APTS - tv + 2Af) / Af [3]

[0734] The audio data movement (number of audio
frames) to the preceding scene is thus calculated.
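
Equation 3 can be evaluated in the same way. As before, this is a sketch under assumptions: the function name and the numbers are hypothetical, time units are arbitrary, and no rounding rule is applied because the text does not give one.

```python
def frames_to_preceding(apts: float, tv: float, af: float) -> float:
    """Equation 3: Amove = (APTS - tv + 2Af) / Af.

    tv is the video 2 transfer start time after the GOP_move quantity
    of GOP has been moved; af is the audio frame duration (Af).
    """
    if af <= 0:
        raise ValueError("audio frame duration must be positive")
    return (apts - tv + 2 * af) / af


# Illustrative values in arbitrary time units.
amove_to_preceding = frames_to_preceding(apts=900, tv=800, af=100)
```

Compared with equation 2, the only difference is the term added to APTS - tv before dividing by the frame duration: +2Af here instead of -Af.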

Audio gap reproduction processing

[0735] While the basic structure of the DVD decoder
DCD used in the present embodiment is as shown in
Fig. 26, the synchronizer 2900 is structured as shown in
Fig. 56 to process the audio reproduction gap.

[0736] As shown in Fig. 56, a block diagram of the
synchronizer 2900 shown in Fig. 26, the synchronizer
2900 comprises an STC generator 2950, audio decoder
controller 2952, and audio decoder control data buffer
2954.

[0737] The STC generator 2950 generates the system
clock STC used as the reference clock for decoding
control based on the system clock reference SCR value
set by the decoding system controller 2300.
[0738] The audio decoder controller 2952 controls the
decoding start and stop of the audio decoder 3200
based on the STC value from the STC generator 2950
and the control information from the audio decoder
control data buffer 2954.

[0739] The audio decoder control data buffer 2954
stores the values of the audio decoding control
information (such as VOB_A_STP_PTM and
VOB_A_GAP_LEN) set by the decoding system
controller 2300.

[0740] The operation of the synchronizer 2900 thus 
comprised according to the present embodiment is 
described below with reference to Fig. 26 and Fig. 56. 

[0741] The overall operation of the DVD decoder DCD
in Fig. 26 is as previously described, and further
description thereof is thus omitted below. The operation
related to the specific processes of the present
embodiment is described below.

[0742] Referring to Fig. 26, the decoding system
controller 2300 reads the audio reproduction stopping time
1 (VOB_A_STP_PTM1), the audio reproduction stopping
period 1 (VOB_A_GAP_LEN1), the audio reproduction
stopping time 2 (VOB_A_STP_PTM2), and the
audio reproduction stopping period 2
(VOB_A_GAP_LEN2) from the DSI packet in the
navigation pack NV, and stores these four values as the
audio decode reproduction stopping information to the
audio decoder control data buffer 2954 of the
synchronizer 2900.

[0743] When the time supplied from the STC genera- 
tor 2950 matches the audio reproduction stopping time 
1 (VOB_A_STP_PTM1) stored in the audio decoder 
control data buffer 2954, the audio decoder controller 
2952 stops the audio decoder 3200 for the audio repro- 
duction stopping period 1 (VOB_A_GAP_LEN1) stored 
in the audio decoder control data buffer 2954. Likewise 
when the time supplied from the STC generator 2950 
matches the audio reproduction stopping time 2 
(VOB_A_STP_PTM2) stored in the audio decoder con- 
trol data buffer 2954, the audio decoder controller 2952 
stops the audio decoder 3200 for the audio reproduction 
stopping period 2 (VOB_A_GAP_LEN2) stored in the 
audio decoder control data buffer 2954. 
[0744] By thus comprising an STC generator 2950 and
audio decoder controller 2952, the synchronizer 2900 is
able to process audio reproduction gaps contained in
the system stream of a multi-scene period when
connecting a stream from a multi-scene period with a
common scene stream.

[0745] Note that an audio reproduction gap may occur
in the present invention in one or both of VOB 6 and
VOB 7 corresponding to scenes 6 and 7 in a parental
lock control scene period as shown in Fig. 21.
[0746] The decoding process executed by the decoding
system controller 2300 of the present invention is
described briefly below with reference to Fig. 60, Fig.
61, Fig. 62, Fig. 63, and Fig. 64. The process executed
by the audio decoder controller 2952 of the present
invention is then described with reference to Fig. 57.
[0747] In Fig. 60 the title selected by the user is 
extracted from the multimedia bitstream MBS stored to 
the digital video disk and the VTS_PGCI #i program 
chain (PGC) data for reproducing the selected title is 
extracted by the decoding system controller 2300 of the 
DVD decoder DCD at step #310214. Then at step 
#310216 the selected title is reproduced based on the 
extracted VTS_PGCI #i program chain (PGC) informa- 
tion. The process shown in Fig. 60 has already been 
described in detail above, and further description 
thereof is thus omitted below. 

[0748] The process of reproducing the VTS_PGCI #i 
program chain in step #310216, Fig. 60, is shown in Fig. 
61 and described below. 

[0749] At step #31030 the decoding system table 
shown in Fig. 58 is set. The transfer process to the 
stream buffer 2400 (step #31032), and the data decod- 
ing process in the stream buffer 2400 (step #31034) are 
executed in parallel. Note that the process of step 
#31032 is based on the cell reproduction information in 
the PGC information entries C_PBI #j. The process 
shown in Fig. 61 has already been described in detail 
above, and further description thereof is thus omitted 
below. 

[0750] The stream buffer data transfer executed for
each cell reproduction information entry (PGC information
entries C_PBI #j) by the process of step #31032 is
described in further detail below referring to Fig. 62.
Because a parental lock control scene is being processed
in the present embodiment, step #31040 of Fig.
62 returns NO, and the procedure moves to step
#31044. The process shown in Fig. 62 has already been
described in detail above, and further description
thereof is thus omitted below.

[0751] The non-multi-angle cell decoding process,
i.e., the parental lock control cell decoding process
executed as step #31044, Fig. 62, is described further
below with reference to Fig. 63. Step #31050 evaluates
the interleaved allocation flag IAF_reg to determine
whether the cell is in an interleaved block. Because the
seamless connection, parental lock control title processed
by the present embodiment is arrayed to an interleaved
block, step #31050 routes control to step
#31052. The process shown in Fig. 63 has already been
described in detail above, and further description
thereof is thus omitted below.

[0752] The non-multi-angle interleaved block process
(step #31052, Fig. 63) is described further below with
reference to Fig. 64. At step #31062 the audio reproduction
stopping time 1 (VOB_A_STP_PTM1), the audio
reproduction stopping period 1 (VOB_A_GAP_LEN1),
the audio reproduction stopping time 2
(VOB_A_STP_PTM2), and the audio reproduction stopping
period 2 (VOB_A_GAP_LEN2) are extracted as
the table data from the DSI packet in the navigation
pack NV (Fig. 20) and stored to the audio decoder
control data buffer 2954 (Fig. 56). The procedure then
moves to step #31064 whereby VOB data transfer is
continued until it is determined at step #31066 that all
interleave units in the interleaved block have been
transferred.

[0753] The process executed by the audio decoder 
controller 2952 in Fig. 56 is described next with refer- 
ence to Fig. 57. 

[0754] At step #202301 the audio decoder controller
2952 reads the audio reproduction stopping time 1
(VOB_A_STP_PTM1) from the audio decoder control
data buffer 2954, and compares VOB_A_STP_PTM1
with the system clock STC from the STC generator
2950. If the values match, i.e., a YES is returned, the
procedure moves to step #202302; if the values do not
match, i.e., a NO is returned, the procedure moves to
step #202303.

[0755] At step #202302 the audio reproduction stopping
period 1 (VOB_A_GAP_LEN1) is read from the
audio decoder control data buffer 2954, and the audio
decoder 3200 is stopped for this period.
[0756] At step #202303, the audio decoder controller
2952 reads the audio reproduction stopping time 2
(VOB_A_STP_PTM2) from the audio decoder control
data buffer 2954, and compares VOB_A_STP_PTM2
with the system clock STC from the STC generator
2950. If the values match, i.e., a YES is returned, the
procedure moves to step #202304; if the values do not
match, i.e., a NO is returned, the procedure returns to
step #202301.

[0757] At step #202304 the audio reproduction stopping
period 2 (VOB_A_GAP_LEN2) is read from the
audio decoder control data buffer 2954, and the audio
decoder 3200 is stopped for this period.
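
The comparison loop of steps #202301 to #202304 can be sketched as follows. This is a minimal illustration only, assuming the system clock STC is sampled as integer ticks; the names AudioGapInfo and audio_stop_events are hypothetical and do not appear in the embodiment, and the actual controller 2952 is hardware rather than Python.

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple


@dataclass
class AudioGapInfo:
    """Hypothetical stand-in for the contents of the audio decoder
    control data buffer 2954 (the four fields read from the DSI packet)."""
    stp_ptm1: int  # VOB_A_STP_PTM1: audio reproduction stopping time 1
    gap_len1: int  # VOB_A_GAP_LEN1: audio reproduction stopping period 1
    stp_ptm2: int  # VOB_A_STP_PTM2: audio reproduction stopping time 2
    gap_len2: int  # VOB_A_GAP_LEN2: audio reproduction stopping period 2


def audio_stop_events(info: AudioGapInfo,
                      stc_ticks: Iterable[int]) -> List[Tuple[int, int]]:
    """Return (STC value, stop period) pairs at which the audio decoder
    would be stopped, following the order of comparisons in the text."""
    events = []
    for stc in stc_ticks:
        if stc == info.stp_ptm1:      # step #202301 returns YES
            events.append((stc, info.gap_len1))   # step #202302
        elif stc == info.stp_ptm2:    # step #202303 returns YES
            events.append((stc, info.gap_len2))   # step #202304
        # otherwise both comparisons return NO and the loop repeats
    return events
```

For example, with stopping times 100 and 200 and periods 5 and 7, stepping the clock from 0 to 299 yields one stop of 5 ticks at 100 and one stop of 7 ticks at 200.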
[0758] The audio reproduction stopping time information
(VOB_A_STP_PTM and VOB_A_GAP_LEN) is
thus written to the DSI packet of the navigation pack NV
in the system stream. Based on this audio reproduction
stopping time information, the DVD decoder DCD comprising
an audio decoder control data buffer 2954 and
an audio decoder controller 2952 for controlling the
audio stream decoding operation is able to process
audio reproduction gaps found in parental lock control
scenes, i.e., in system streams shared by plural different
program chains as shown in Fig. 30. It is therefore
able to prevent intermittent video reproduction (video
freezing) and intermittent audio reproduction (muting)
caused by a data underflow state in the video buffer or
audio buffer resulting when one common system stream
is connected to one of plural system streams branching
from (following) or to (preceding) the one system
stream.

[0759] Note that while audio data is moved in audio
frame units in the above embodiment, the same effect
can be achieved if the audio frames are broken into
smaller units used as the movement unit to connect and
contiguously reproduce system streams.
[0760] Furthermore, while video data is moved in
GOP units according to the second system stream production
method in the above embodiment, the same
effect can be achieved if the GOP units are broken into
smaller units used as the movement unit to connect and
contiguously reproduce system streams.
[0761] Furthermore, while only audio data is moved
according to the first system stream production method
in the above embodiment, the same effect can be
achieved if video data is also moved from the system
stream preceding the connection to the system stream
following the connection.

[0762] The present embodiment has also been
described with reference to only one video stream and
one audio stream, but the invention shall not be so limited.
[0763] While the present embodiment has been
described with particular reference to branching and
connecting streams as used to implement a parental
lock control feature, seamless contiguous reproduction
can also be achieved in multi-angle scene periods in
which the plural video streams provide different perspectives
(views) of the same title content, and using
multimedia optical disks to which system streams configured
as described above are recorded.
[0764] The second system stream production method
described above is described as being used at connections
from one of plural system streams to a single common
system stream in the present embodiment.
However, the same effect can be achieved using the first
system stream production method described above
when the same audio information is recorded to system
streams not shared between different program chains.
[0765] The present embodiment was also described 
using a digital video disk DVD, but the same effect can 
be achieved using other optical disks recording system 
streams having the same data structure as that of the 
present embodiment described above. 
[0766] With the audio and video data interleaving 
method of the present embodiment the audio data input 
by the decoding time includes only the data used in the 
next audio decode operation and any remainder from 
the packet transfer operation (approximately 2 KB). 
However, insofar as an audio buffer underflow state 
does not occur, i.e., insofar as the interleaving method 
interleaves the audio and video data to transfer audio 
data in a quantity and frequency preventing an audio 
buffer underflow state, the same effect can be achieved. 
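
The underflow condition referred to above can be illustrated with a small check. This is only a sketch under stated assumptions: each audio frame is treated as available once its input to the audio buffer completes, decoding is assumed to occur at a fixed frame period, and the function name audio_underflows and its parameters are hypothetical.

```python
from typing import Sequence


def audio_underflows(arrivals: Sequence[float],
                     first_decode_time: float,
                     frame_time: float,
                     frame_count: int) -> bool:
    """Return True if any audio frame has not fully arrived in the
    audio buffer by the time it is due to be decoded.

    arrivals[k] is the (assumed) input end time of the k-th audio frame;
    the k-th decode is assumed to occur at first_decode_time + k * frame_time.
    """
    for k in range(frame_count):
        decode_time = first_decode_time + k * frame_time
        if k >= len(arrivals) or arrivals[k] > decode_time:
            return True   # frame k is missing or late: buffer underflow
    return False
```

Any interleaving that keeps this check false for every frame transfers audio data "in a quantity and frequency preventing an audio buffer underflow state" in the sense used above.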
[0767] Information relating to the audio reproduction 
gap at a system stream branch is written to the audio 
reproduction stopping time 1 (VOB_A_STP_PTM1) and 
the audio reproduction stopping period 1 
(VOB_A_GAP_LEN1) fields of the navigation pack NV 
in the present embodiment, but this audio reproduction 
gap information may be written to the audio reproduc- 
tion stopping time 2 (VOB_A_STP_PTM2) and the 
audio reproduction stopping period 2 
(VOB_A_GAP_LEN2) fields. 

[0768] Information relating to the audio reproduction 
gap at a system stream connection is written to the 
audio reproduction stopping time 2 
(VOB_A_STP_PTM2) and the audio reproduction stop- 
ping period 2 (VOB_A_GAP_LEN2) fields of the naviga- 
tion pack NV in the present embodiment, but this audio 
reproduction gap information may be written to the 
audio reproduction stopping time 1 
(VOB_A_STP_PTM1) and the audio reproduction stop- 
ping period 1 (VOB_A_GAP_LEN1) fields. 
[0769] The difference between the input end times to 
the respective buffers of the audio and video data in the 
system stream is defined as at most the reproduction 
time of two audio frames in this embodiment. However, 
if the video is encoded with variable bit rate (VBR) cod- 
ing and the video bit rate before the connection is 
reduced, the same effect can be achieved even when 
the input start time of the video data to the video buffer 
is advanced. 

[0770] The difference between the input start times to 
the respective buffers of the audio and video data in the 
system stream is defined as at most the reproduction 
time of two audio frames in this embodiment. However, 
if the video is encoded with variable bit rate (VBR) cod- 
ing and the video bit rate before the connection is 
reduced, the same effect can be achieved even when 
the input end time of the video data to the video buffer is 
delayed. 

[0771] The present embodiment is also described as
accumulating one audio frame in the audio buffer when
system streams are connected, but the present invention
shall not be so limited and the same effects can be
achieved if a different audio buffer accumulation level is
used insofar as an audio buffer overflow state is not
induced.

[0772] Furthermore, while video data is moved in 
GOP units in the above embodiment, if the video data 
input bit rates differ in the connected system streams, 
the same effect can be achieved by encoding the GOP 
to be moved at the input bit rate of the video data in the 
system stream to which the GOP is moved. 
[0773] The compressed audio and video streams are 
also used for data movement in the above embodiment, 
but the same effect can be achieved by first moving the 
data at the pre-encoded material level. 
[0774] Only one GOP is also moved in the above 
embodiment, but the same effect can be achieved by 
moving two or more, i.e., plural, GOP. 
[0775] It is therefore possible by means of the present 
invention thus described to reproduce system streams 
from different program chains as a single contiguous 
title without intermittent video presentation (freezing) or 
intermittent audio presentation (muting) when connect- 
ing and contiguously reproducing plural system streams 
from a multimedia optical disk recorded with the video 
packets and audio packets interleaved to a single sys- 
tem stream meeting the following conditions: 

(a) the difference between the input start time of the 
first video packet and the input start time of the first 
audio packet at the beginning of the system stream 
is less than the reproduction time of the number of 
audio frames that can be stored in the audio buffer 
plus one audio frame, and 

(b) the difference between the input end time of the 
last video packet and the input end time of the last 
audio packet at the end of the system stream is less 
than the reproduction time of the number of audio 
frames that can be stored in the audio buffer plus 
one audio frame. 
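
Conditions (a) and (b) can be expressed as a simple check. The sketch below is illustrative only: it assumes all times share one unit, takes the difference as an absolute value (the text does not state a sign convention), and uses the hypothetical name meets_seamless_conditions.

```python
def meets_seamless_conditions(video_start: float, audio_start: float,
                              video_end: float, audio_end: float,
                              frame_time: float, buffer_frames: int) -> bool:
    """Check conditions (a) and (b) for one system stream.

    frame_time    -- reproduction time of one audio frame
    buffer_frames -- number of audio frames the audio buffer can store
    """
    # Limit: reproduction time of (frames storable in the buffer + 1) frames.
    limit = (buffer_frames + 1) * frame_time
    start_ok = abs(video_start - audio_start) < limit   # condition (a)
    end_ok = abs(video_end - audio_end) < limit         # condition (b)
    return start_ok and end_ok
```

With a frame time of 32 units and a buffer holding one frame, the limit is 64 units, so a start-time difference of 50 passes while a difference of exactly 64 does not.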

[0776] Using a multimedia optical disk recorded with 
a system stream containing plural scenario branches, 
i.e., plural system streams branching from a single sys- 
tem stream to which said plural system streams may 
connect, where at least the same audio content is 
recorded to one or more audio frames at the beginning 
of each of the plural system streams connecting to said 
single system stream, it is particularly possible to repro- 
duce plural scenario titles as single natural titles without 
stopping the video presentation (video freezing) at the 
system stream connection when connecting and contig- 
uously reproducing the system streams. 
[0777] Using a multimedia optical disk recorded with
a system stream containing plural scenario connections,
i.e., plural system streams connecting to a single
system stream following thereafter, where at least the
same video content is recorded to one or more video
frames at the end of each of the plural system streams
connecting to said single system stream, or at the
beginning of the single system stream connecting to the
plural system streams, it is particularly possible to
reproduce plural scenario titles as single natural titles
without stopping the video presentation (video freezing)
at the system stream connection when connecting and
contiguously reproducing the system streams.

[0778] Video or audio buffer underflow states at system
stream connections, i.e., intermittent video presentation
(video freezing) or intermittent audio presentation
(audio muting), resulting from the time difference in the
video and audio reproduction times of different reproduction
paths can also be prevented by means of a DVD
reproduction apparatus wherewith audio reproduction
gap information is recorded to the reproduction control
information, and said audio reproduction gap information
is used by an audio decoder controller to start and
stop audio decoder operation appropriately.
[0779] By inserting a time difference in the video and
audio reproduction times of different reproduction paths
as an audio reproduction gap in one system stream not
shared by different program chains, problems created
by system stream connections, i.e., across system
streams, can be converted to a problem contained
within a single system stream. It is therefore possible to
contain the audio reproduction gap information within
the DSI packet of the system stream, thus writing both
the audio reproduction gap and the audio reproduction
gap information to a single system stream, and thereby
simplifying the data structure.
[0780] As a result, the present invention makes it simple
to reuse, i.e., share, system streams.
[0781] Furthermore, because the audio reproduction
gap is contained within a single system stream, the
audio reproduction gap can be moved to any desirable
position in the system stream. As a result, it is possible
to move the audio reproduction gap to a silent or other
audibly innocuous location.

Industrial Applicability

[0782] As is apparent from the above, a method and an
apparatus according to the present invention for
interleaving a bitstream to record the interleaved
bitstream to a recording medium and reproduce the
recorded bitstream therefrom are suitable for application
in an authoring system which can generate a new title by
editing a title constructed by bitstreams carrying various
information in accordance with the user's request, and
are also suitable for a Digital Video Disk System, or DVD
System, being developed recently.
[0783] A list of further embodiments of the invention follows:

Embodiment 1: An optical disc (M) for recording
one or a plurality of system streams (VOB) containing
audio data and video data, wherein audio data
and video data of a plurality of system streams
(VOB) recorded to the optical disc (M) are interleaved
(multiplexed) such that a difference (Tbl -
Tad1) of the input start times of video data and
audio data to a buffer (2600) in a video decoder and
a buffer (2800) in an audio decoder is less than the
reproduction time of a number of audio frames (Af)
that can be stored in the audio buffer plus one
frame.

Embodiment 2: An optical disc (M) for recording
one or a plurality of system streams (VOB) containing
audio data and video data,

wherein audio data and video data of a plurality
of system streams (VOB) recorded to the optical
disc (M) are interleaved (multiplexed) such that
a difference between the input start times of the
video data and audio data to a buffer in the video
decoder and a buffer in the audio decoder is less
than the reproduction time of two audio frames (2 x
Af).

Embodiment 3: The optical disc (M) with the features
of embodiment 1, wherein audio data and
video data of a plurality of system streams (VOB)
recorded to the optical disc (M) are interleaved
such that a difference (Tvae - Taae) of the input end
times of video data and audio data to a buffer
(2600) in a video decoder and a buffer (2800) in an
audio decoder is less than the reproduction time of
a number of audio frames (Af) that can be stored in
the audio buffer plus one frame (Af).

Embodiment 4: The optical disc (M) with the features
of embodiment 3, wherein audio data and
video data of a plurality of system streams (VOB)
recorded to the optical disc (M) are interleaved
such that a difference of the input end times of
video data and audio data to a buffer (2600) in a
video decoder and a buffer (2800) in an audio
decoder is less than the reproduction time of two
audio frames (2 x Af).

Embodiment 5: An optical disc (M) for recording
one or a plurality of system streams (VOB) containing
audio data and video data,

wherein when one or a plurality of system
streams (VOB) is shared by a plurality of program
chains (VTS_PGC), and there are at least two different
system streams (VOB) reproduced immediately
after a system stream (VOB) shared by a
plurality of program chains (VTS_PGC) that are not
the same in each of the plurality of system streams
(VOB),

a same audio content is recorded to at least a
first audio frame (Af) in at least two system
streams (VOB) reproduced immediately after a
system stream (VOB) shared by a plurality of
program chains (VTS_PGC).

Embodiment 6: The optical disc (M) with the fea- 
tures of embodiment 5, wherein when one or a plu- 
rality of system streams (VOB) is shared by a 
plurality of program chains (VTS_PGC), and there 
are at least two different system streams (VOB) 
reproduced immediately after a system stream 
(VOB) shared by a plurality of program chains 
(VTS_PGC) that are not the same in each of the 
plurality of system streams (VOB), 

a same video content is recorded to at least a 
first video frame (Vf) in at least two system 
streams (VOB) reproduced immediately after a 
system stream (VOB) shared by a plurality of 
program chains (VTS_PGC). 

Embodiment 7: The optical disc (M) with the features
of embodiment 5 and embodiment 6, wherein
when one or a plurality of system streams (VOB) is 
shared by a plurality of program chains 
(VTS_PGC), and there are at least two different 
system streams (VOB) reproduced immediately 
before a system stream (VOB) shared by a plurality 
of program chains (VTS_PGC) that are not the 
same in each of the plurality of system streams 
(VOB), 

a same video content is recorded to at least a 
last video frame (Vf) in at least two system 
streams (VOB) reproduced immediately before 
a system stream (VOB) shared by a plurality of 
program chains (VTS_PGC). 

Embodiment 8: The optical disc (M) with the fea- 
tures of embodiment 5, embodiment 6, or embodi- 
ment 7, wherein when one or a plurality of system 
streams (VOB) is shared by a plurality of program 
chains (VTS_PGC), and there are at least two dif- 
ferent system streams (VOB) reproduced immedi- 
ately before a system stream (VOB) shared by a 
plurality of program chains (VTS_PGC) that are not 
the same in each of the plurality of system streams 
(VOB), 

a same audio content is recorded to at least a 
last audio frame (Af) in at least two system 
streams (VOB) reproduced immediately before 
a system stream (VOB) shared by a plurality of 
program chains (VTS_PGC). 

Embodiment 9: The optical disc (M) with the features
of embodiment 5, embodiment 6, embodiment 7,
or embodiment 8, wherein reproduction
control information (NV) is provided in a system
stream (VOB), and audio reproduction stop information
(VOB_A_STP_PTM1 and
VOB_A_GAP_LEN) is written to said reproduction
control information.

Embodiment 10: A recording method for recording
one or a plurality of system streams (VOB) containing
audio data and video data to an optical disc (M),
whereby one or a plurality of system streams (VOB)
are recorded with the audio data and video data
interleaved such that a difference of the input start
times of video data and audio data to a buffer
(2600) in a video decoder and a buffer (2800) in an
audio decoder is less than the reproduction time of
a number of audio frames (Af) that can be stored in
the audio buffer plus one frame.

Embodiment 11: The recording method with the
features of embodiment 10, wherein one or a plurality
of system streams (VOB) are recorded with the
audio data and video data interleaved such that a
difference of the input end times of video data and
audio data to a buffer (2600) in a video decoder and
a buffer (2800) in an audio decoder is less than the
reproduction time of a number of audio frames (Af)
that can be stored in the audio buffer (2800) plus
one frame.

Embodiment 12: A recording method for recording
one or a plurality of system streams (VOB) containing
audio data and video data to an optical disc (M),
whereby when one or a plurality of system streams
(VOB) is shared by a plurality of program chains
(VTS_PGC), and there are at least two different
system streams (VOB) reproduced immediately
after a system stream (VOB) shared by a plurality of
program chains (VTS_PGC) that are not the same
in each of the plurality of system streams (VOB),

a same audio content is recorded to at least a
first audio frame (Af) in at least two system
streams (VOB) reproduced immediately after a
system stream (VOB) shared by a plurality of
program chains (VTS_PGC).

Embodiment 13: The recording method with the
features of embodiment 12, wherein when one or a
plurality of system streams (VOB) is shared by a
plurality of program chains (VTS_PGC), and there
are at least two different system streams (VOB)
reproduced immediately after a system stream
(VOB) shared by a plurality of program chains
(VTS_PGC) that are not the same in each of the
plurality of system streams (VOB),

a same video content is recorded to at least a
first video frame (Vf) in at least two system
streams (VOB) reproduced immediately after a
system stream (VOB) shared by a plurality of
program chains (VTS_PGC).

Embodiment 14: The recording method with the
features of embodiment 12 and embodiment 13,
wherein when one or a plurality of system streams 
(VOB) is shared by a plurality of program chains 
(VTS_PGC), and there are at least two different 
system streams (VOB) reproduced immediately 
before a system stream (VOB) shared by a plurality 
of program chains (VTS_PGC) that are not the 
same in each of the plurality of system streams 
(VOB), 

a same video content is recorded to at least a 
last video frame (Vf) in at least two system 
streams (VOB) reproduced immediately before 
a system stream (VOB) shared by a plurality of 
program chains (VTS_PGC). 

Embodiment 15: The recording method with the 
features of embodiment 12, embodiment 13, or 
embodiment 14, wherein when one or a plurality of 
system streams (VOB) is shared by a plurality of 
program chains (VTS_PGC), and there are at least 
two different system streams (VOB) reproduced 
immediately before a system stream (VOB) shared 
by a plurality of program chains (VTS_PGC) that 
are not the same in each of the plurality of system 
streams (VOB), 

a same audio content is recorded to at least a 
last audio frame (Af) in at least two system 
streams (VOB) reproduced immediately before 
a system stream (VOB) shared by a plurality of 
program chains (VTS_PGC). 

Embodiment 16: The recording method with the
features of embodiment 12, embodiment 13,
embodiment 14, or embodiment 15, wherein reproduction
control information (NV) is provided in a
system stream (VOB), and audio reproduction stop
information (VOB_A_STP_PTM1 and
VOB_A_GAP_LEN) is written to said reproduction
control information.

Embodiment 17: An optical disc reproduction apparatus
(DCD) comprising a data read means for
reading reproduction control information from an
optical disc (M) to which is recorded a system
stream (VOB) containing reproduction control information
declaring audio reproduction stop information
(VOB_A_STP_PTM1 and VOB_A_GAP_LEN),
and

an audio reproduction stop control
(VOB_A_STP_PTM1 and VOB_A_GAP_LEN)
means for stopping audio reproduction based
on the read reproduction control information.

Embodiment 18: A method for reproducing data
from an optical disc (M), whereby reproduction control
information (NV) is read from an optical disc (M)
to which is recorded a system stream (VOB) containing
reproduction control information declaring
audio reproduction stop information
(VOB_A_STP_PTM1 and VOB_A_GAP_LEN), and

audio reproduction stop (VOB_A_STP_PTM1
and VOB_A_GAP_LEN) control stops audio
reproduction based on the read reproduction
control information (NV).

Claims 

1. An optical disc storing a bit stream containing video
data unit, first audio data and second audio data
which are presented during presentation of the
video data, characterized in that

the bit stream contains an audio presentation
discontinuous part between the first and second
audio data, and
the optical disc further stores audio gap information
indicating a presentation time of the
audio presentation discontinuous part.

2. An optical disc according to claim 1, characterized
in that:

the video data has a presentation duration of
one video frame or more, and each of the first
and second audio data has a presentation
duration of one audio frame or more; and
the audio presentation discontinuous part has
a duration less than one audio frame.

3. An optical disc according to claim 1 or 2, characterized
in that:

the bit stream contains a plurality of data blocks
with time stamps each indicating a presentation
time of a corresponding data block, each
data block containing at least one of the video
data, the first audio data, the second audio
data, and the audio gap information; and
the audio gap information indicates a time
stamp of a data block including the audio presentation
discontinuous part and the duration of
the audio presentation discontinuous part.

4. An optical disc according to any one of the preceding
claims, characterized in that:

the bit stream includes a plurality of first system
streams each including a part of the video data
and the first audio data, a second system
stream including a remaining part of the video
data and the second audio data, and chain
information indicating reproduction orders of
the first system streams and the second system
stream;
the chain information indicates that one of the
first system streams is alternatively presented
and that the presentation of the second system
stream follows the presentation of any one of
the first system streams;
the video data of the second system stream is
presented seamlessly after the presentation of
the video data of any one of the first system
streams; and
the presentation of the second audio data of
the second system stream is discontinuous
with the presentation of the first audio data of at
least one of the first system streams.

5. A reproduction apparatus for use in reproducing the
optical disc of claim 1, comprising:

reading means for reading the bit stream and
the audio gap information from the optical disc;
a video decoder for presenting the video data
contained in the read bit stream;
an audio decoder for presenting the audio data
contained in the read bit stream;
a controller for controlling the reading means,
the video decoder and the audio decoder,

wherein the controller controls the audio
decoder to suspend at a time indicated by the
audio gap information.

6. A reproduction method for reproducing the optical
disc of claim 1, comprising the steps of:

reading the bit stream and the audio gap infor- 
mation from the optical disc; and 
presenting the video data, the first audio data 
and the second audio data contained in the 
read bit stream, 

wherein presentation of the first audio data and 
the second audio data is suspended at a time 
indicated by the audio gap information. 